To my parents and to my sister. 
This book would not exist without you. 


Preface 


This book started as a collaboration between myself and students of the Mas- 
sachusetts Academy of Arts and Sciences: I wrote notes and they responded with 
questions and what they thought could be done better. One of the requests was 
for a preface to the book describing how best to read it. The reader might well be 
confused about why this is necessary: surely, one reads a book from left to right, 
top to bottom, starting at the beginning and finishing at the end? And, indeed, this 
is one possible way to read it, but it might not be the best one, particularly if this 
is the reader’s first foray into a mathematics text that is primarily proof-driven. 
Such a reader (but not only such a reader) might naturally consider some of the 
following. 


• Should you read the proofs of statements in this book?

• How should you read proofs? Should you try to memorize them?

• Should you read every single chapter of this book?

• In each chapter, which exercises should you do? Should you try to do all of
them, or should you prioritize in some way?


There are no right or wrong answers to these questions. A reader who is primar- 
ily interested in dipping their toes into new mathematical waters and feeling the 
direction of the current—as opposed to diving in headlong in search of deeper 
mathematical understanding—can get an entirely valid (if somewhat superficial) 
experience of this book just by reading through the chapters but ignoring the proofs 
and exercises. 

However, for a reader looking for greater conceptual understanding, I would
very strongly recommend a different approach. To begin with, I think that such
a reader should read through every single proof in the text. There are two
reasons for this. The first is that mathematics is nothing without proofs: logical
deduction guided by intuitive reasoning is at the heart of what mathematics is and 
it has been this way since the days of the ancient Greeks. Mathematical literature 
aimed at grade school students and even lower level undergraduate students very 
commonly ignores this, but I think it is a mistake—it deprives such students of 
witnessing what mathematics really is. I will not belabor this point too much, as
it has already been made by far more eloquent writers [10]. The second reason
is more specific to this particular text: the proofs of basic results about linear 
fractional transformations can be both elegant and deeply enlightening—ideally, 
they should leave the reader not only with the feeling that, yes, these statements 
are true, but also give them a deep conviction as to why they are true. The careful 
reader may find that some basic ideas come up again and again in these proofs and 
hopefully this will provide some insight into how such results are discovered and 
how they can be reproduced. Readers without much experience in reading proofs 
may be well served by remembering the words of celebrated mathematician Paul 
Halmos [4]: 


Don’t just read it; fight it! Ask your own question, look for your own examples, discover 
your own proofs. Is the hypothesis necessary? Is the converse true? What happens in the 
classical special case? What about the degenerate cases? Where does the proof use the 
hypothesis? 


In my opinion, memorizing proofs is usually a waste of time. Human memory is a 
fickle thing and without substantial training, it is difficult to memorize something 
verbatim without fear that it will morph into something different with the
passage of time. This is perhaps not so great a concern when memorizing a poem 
or novel—a misremembered word is unlikely to drastically change the meaning. 
In mathematics, however, changing any part of a proposition is highly likely to 
produce something blatantly wrong or simply word salad. A reader who wants to 
actually learn a theorem should proceed in a different way: strive to understand 
the theorem in its totality. This means: 


• Understand the statement of the theorem.
• Boil the proof of the theorem down to its essential ideas.
• Connect the theorem and its proof to other theorems and concepts that you have
learned.
• Convince yourself that this theorem was the right thing to have written down.


I guarantee that any theorem that has sunk into your bones in this manner is a 
theorem that you will never forget, and it will instead become a foundation upon 
which you can continue building. For an even greater understanding, I recommend 
performing a similar analysis for the definitions in the book: try to understand not 
just what they say but why these definitions were chosen. 

I wrote this book with the intention of fostering mathematical growth. The exer- 
cises in this book are written accordingly and are organized into sections at the 
end of each chapter. The difficulty of the exercises varies greatly: some are very 
simple continuations of proofs written out in the main text; others ask for proofs of 
entirely new results, but broken down into many steps to guide the reader through 
the process; still others ask for entirely new proofs without any guidance. Depend- 
ing on the mathematical maturity of the reader, these exercises will range from 
essentially trivial to deeply challenging. Being unable to do all of the exercises
should not be taken as a sign of defeat but as a chance for continued growth. I
would recommend going through the exercises that cover some missing pieces in 
the exposition of the main text—these can be easily identified by the fact that they 
are cited as “(See Exercise xxx)” in the text. In particular, wherever a proof in the 
main text is left as an exercise to the reader, that is something that should be prior- 
itized. On the other hand, for readers looking to avoid busywork, I recommend the 
following litmus test to decide whether you should skip a problem: when you look 
at it, is it obvious to you how to solve it? Do you understand it well enough that 
you could explain to another person how to do it? If the answer to both questions 
is an honest “yes”, then I think that skipping the exercise is permissible. If you 
are unsure, try to find a friend and explain to them your reasoning. If all of your 
friends are busy and/or don’t want to hear about math, a rubber duck will usually 
do in a pinch. 

If you are unfamiliar with writing mathematical proofs and you find yourself 
struggling with the exercises as a result, there are various excellent resources out 
there that might be of help. There is, for example, Polya’s classic book How to 
Solve It [12] which describes various mathematical strategies that exist and how 
one can implement them. At the time of writing, the Art of Problem Solving 
maintains a helpful wiki and forums for discussing problems and posting solutions; 
the same company has some helpful books geared toward particular subject areas 
such as algebra, precalculus, and others. There are many, many other books and 
websites out there. However, even with all of these aids, learning proof-writing
is challenging, and it is important to remember that this is okay.

Above all, have fun! This is a truly wonderful subject and it deserves to be 
enjoyed. Play with it, explore, and I wish you good hunting. 


Southborough, USA Arseniy Sheydvasser 
May 2022 
Euclidean Isometries and Similarities 


In which we think deeply of simple 
things. 


Arnold Ross (paraphrased). 


This is a book all about functions of the form

\[ \varphi(z) = \frac{az + b}{cz + d}, \]

where $a, b, c, d$ are complex numbers. Such functions are usually known as either
linear fractional transformations, or sometimes Möbius transformations. Our goal
for this chapter is to understand intimately the simplest kind of linear fractional
transformations, where $c = 0$ and $d = 1$, that is, functions of the form

\[ \varphi(z) = az + b, \]



which are usually called (complex) affine maps. We will see that these transfor- 
mations will describe isometries and similarities of the Euclidean plane, and we 
will make good use of this to prove some basic geometric theorems. Before we do 
that, however, we should remind ourselves of the basics of Cartesian geometry as 
expressed in terms of complex numbers. 


1.1 The Complex Plane and Affine Maps 


Usually, one describes the Cartesian plane in terms of pairs of real numbers (x, y). 
However, for our purposes, it is more convenient to write everything in terms 
of complex numbers $z = x + iy$, where $x = \Re(z)$ is the real part and $y = \Im(z)$ is the
imaginary part. This has the immediate benefit of making various definitions
more compact. For example, we know that the distance between two points $(x_1, y_1)$,
$(x_2, y_2)$ in Cartesian geometry is

\[ d_{\mathrm{Euclid}}((x_1, y_1), (x_2, y_2)) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}. \]

Using complex numbers, this can be phrased instead as $d_{\mathrm{Euclid}}(z_1, z_2) = |z_1 - z_2|$,
where

\[ |x + iy| = \sqrt{x^2 + y^2} \]

is called the norm. The norm is particularly easy to think about because $|z|^2 = z\bar{z}$,
where $\overline{x + iy} = x - iy$ is the complex conjugate. This immediately implies that the
norm is multiplicative: for all $z_1, z_2 \in \mathbb{C}$,¹ $|z_1 z_2| = |z_1||z_2|$.

Fig. 1.1 A point $z$ in the complex plane and the angle $\theta$ between the positive $x$-axis and $z$.

Complex numbers make it very convenient to describe translations. Specifically, 
a translation is just a transformation of the form $z \mapsto z + z_0$ for some $z_0 \in \mathbb{C}$. Why is
this? Well, suppose our translation is supposed to shift everything in the $x$-direction
by $x_0$, and in the $y$-direction by $y_0$. Writing $z_0 = x_0 + iy_0$, for any $z = x + iy$,

\[ z + z_0 = (x + x_0) + (y + y_0)i, \]


which is exactly the desired effect. 

The description we have given describes complex numbers in terms of Cartesian 
coordinates. Alternatively, any point z in the complex plane can be specified by its 
distance $r$ away from the origin and the angle $\theta$ between the rays from the origin through $1$ and $z$.
By basic trigonometry, $z = r\cos(\theta) + ir\sin(\theta)$; see Figure 1.1 for an illustration.
This can be written in an equivalent way using Euler's formula,

\[ \cos(\theta) + i\sin(\theta) = e^{i\theta}, \]

which is not particularly hard to prove if you are familiar with Taylor series. (See
Exercise 1.3.1.) Therefore, any complex number $z$ can either be written as $x + iy$
or as $re^{i\theta}$, where $r = |z| = \sqrt{x^2 + y^2}$ is the distance from the origin, and $\theta$ is the
angle between the rays through $1$ and $z$. This makes it very easy to describe rotations:
concretely, $z \mapsto e^{i\theta}z$ is a rotation by $\theta$ radians counter-clockwise around the origin.
Why is this? Well, if $z = re^{i\phi}$, then

\[ e^{i\theta}z = re^{i\phi}e^{i\theta} = re^{i(\phi + \theta)}, \]

which is indeed a rotation.
One final kind of transformation that is easy to describe is the dilation.

Fig. 1.2 An illustration of the proof of Theorem 1.1 for the specific case of the map $z \mapsto \frac{2}{3}e^{i\pi/3}z + 1$.
(a) shows an initial configuration; (b) shows the effect of $z \mapsto e^{i\pi/3}z$ on (a); (c) shows the effect of
$z \mapsto \frac{2}{3}z$ on (b); finally, (d) shows the effect of $z \mapsto z + 1$ on (c).

¹ I will be making frequent use of set-theoretical notation in this book. If the reader is unfamiliar
with it, I strongly recommend looking at Appendix A.


Definition 1.1 A dilation of $\mathbb{C}$ is a transformation of the form $z \mapsto rz$ for some
$r > 0$.


Intuitively, a dilation is a rescaling or a “zoom” of $\mathbb{C}$. Dilations are the final
ingredient we need to describe all complex affine maps.


Theorem 1.1 (Composition Theorem for Complex Affine Maps) 
Let $a, b$ be complex numbers with $a \neq 0$. Let

\[ \varphi : \mathbb{C} \to \mathbb{C}, \qquad z \mapsto az + b. \]

Then $\varphi$ is a composition of a rotation, a dilation, and a translation.


Remark 1.1 The restriction that $a \neq 0$ is important, since otherwise $\varphi$ simply maps
all of $\mathbb{C}$ to a single point $b$.

Proof First, we write $a = re^{i\theta}$ with $r = |a| > 0$. Then, we define three maps

\[ \varphi_1(z) = e^{i\theta}z, \qquad \varphi_2(z) = rz, \qquad \varphi_3(z) = z + b. \]

It is easy to see that $\varphi = \varphi_3 \circ \varphi_2 \circ \varphi_1$; indeed,

\[ \varphi_3(\varphi_2(\varphi_1(z))) = \varphi_3(\varphi_2(e^{i\theta}z)) = \varphi_3(re^{i\theta}z) = re^{i\theta}z + b = az + b = \varphi(z). \]

This decomposition is illustrated in Figure 1.2. □
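
The decomposition in the proof can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper names `decompose` and `apply_steps` and the sample values of `a`, `b`, and `z` are assumptions made purely for illustration. It relies on the standard-library function `cmath.polar` to write $a = re^{i\theta}$.

```python
import cmath

def decompose(a: complex, b: complex):
    """Split z -> a*z + b into (theta, r, b) with a = r * e^{i*theta}.

    Hypothetical helper for illustration; requires a != 0.
    """
    if a == 0:
        raise ValueError("a must be nonzero")
    r, theta = cmath.polar(a)  # r = |a| > 0, theta in [-pi, pi]
    return theta, r, b

def apply_steps(theta: float, r: float, b: complex, z: complex) -> complex:
    """Apply the rotation, then the dilation, then the translation."""
    rotated = cmath.exp(1j * theta) * z  # phi_1(z) = e^{i*theta} z
    dilated = r * rotated                # phi_2(z) = r z
    return dilated + b                   # phi_3(z) = z + b

# Sample values chosen only for illustration.
a, b, z = 1 + 2j, 3 - 1j, 0.5 - 0.25j
theta, r, shift = decompose(a, b)
stepwise = apply_steps(theta, r, shift, z)
direct = a * z + b
print(abs(stepwise - direct) < 1e-12)
```

Note that `cmath.polar` reports the angle in $[-\pi, \pi]$, so a rotation by $5\pi/4$ would appear as $-3\pi/4$; the two describe the same rotation.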


Philosophical Principle 


This basic result showcases a technique that we will see over and over again: if 
you want to understand some kind of mathematical structure, try to break it into 
simple pieces that are easy to analyze, then see how you can put this information 
together.


▶ Example Find $a, b \in \mathbb{C}$ such that $\varphi(z) = az + b$ moves $1$ to $i$ and $i$ to $1 + i$.
Decompose $\varphi$ as a rotation, dilation, and translation.
Since $\varphi(1) = a + b = i$ and $\varphi(i) = ai + b = 1 + i$, subtracting the first equation from the
second shows that $a(i - 1) = 1$, so

\[ a = \frac{1}{i - 1} = \frac{1}{i - 1} \cdot \frac{i + 1}{i + 1} = -\frac{i + 1}{2}. \]

From this, we get that $b = i - a = i + (i + 1)/2 = (3i + 1)/2$.

Next, we must write $a$ in the form $re^{i\theta}$. We have $r = |-(i + 1)/2| = |i + 1|/2 = \sqrt{2}/2$.
To calculate $\theta$, we note that

\[ \tan(\theta) = \frac{r\sin(\theta)}{r\cos(\theta)} = \frac{-1/2}{-1/2} = 1, \]

and since $-(i + 1)/2$ is in the third quadrant of the complex plane, it follows that
$\theta = \pi + \pi/4 = 5\pi/4$. Therefore, $\varphi$ can be decomposed as first a rotation counter-clockwise
by $5\pi/4$ radians, then a dilation by $\sqrt{2}/2$, and finally a translation by
$(3i + 1)/2$.
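
The arithmetic in this example can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the helper `affine_from_two_points` is a hypothetical name introduced here, and it simply solves the two linear equations $az_1 + b = w_1$, $az_2 + b = w_2$.

```python
import cmath

def affine_from_two_points(z1, w1, z2, w2):
    """Solve a*z1 + b = w1 and a*z2 + b = w2 for (a, b); needs z1 != z2."""
    a = (w1 - w2) / (z1 - z2)
    b = w1 - a * z1
    return a, b

# The data of the example: 1 goes to i, and i goes to 1 + i.
a, b = affine_from_two_points(1, 1j, 1j, 1 + 1j)

r, theta = cmath.polar(a)            # theta is reported in [-pi, pi]
theta_ccw = theta % (2 * cmath.pi)   # the same angle, shifted into [0, 2*pi)
print(a, b, r, theta_ccw)
```

Reducing the angle modulo $2\pi$ is only a presentational choice, made so that the rotation is expressed as a counter-clockwise angle as in the text.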


1.2 Isometries 


With Theorem 1.1 as our launch point, we shall now endeavor to classify all of 
the affine maps z +> az + b by sorting them into various types of transformations 
depending on what types of properties they preserve. We begin this quest by talking 
about one of the most fundamental types of transformations in geometry: isometries.

Definition 1.2 A function $\Psi : \mathbb{C} \to \mathbb{C}$ is an isometry if it does not change the
distance between points, that is, if

\[ d_{\mathrm{Euclid}}(\Psi(z_1), \Psi(z_2)) = d_{\mathrm{Euclid}}(z_1, z_2) \]

for all $z_1, z_2 \in \mathbb{C}$.


While we have only defined isometries of C, the concept is much more broadly 
applicable. It can be defined for any metric space—roughly speaking, any set together 
with a distance function satisfying a few reasonable assumptions. We will delay 
discussing this general theory for now; we will return to it in Chapter 4. 

Intuitively, we say that isometries are those functions that preserve distance, mean- 
ing precisely that although they may move points to other points, they do not change 
the distances between different points. This is an idea that will come up over and 
over again in this book: when you have some families of transformations, try to study 
what it is that they preserve. 

Since $d_{\mathrm{Euclid}}(z_1, z_2) = |z_1 - z_2|$, if $\Psi$ is an isometry of $\mathbb{C}$, then $|\Psi(z_1) - \Psi(z_2)| =
|z_1 - z_2|$ for all $z_1, z_2 \in \mathbb{C}$. Conversely, if $|\Psi(z_1) - \Psi(z_2)| = |z_1 - z_2|$ for all
$z_1, z_2 \in \mathbb{C}$, then $\Psi$ is an isometry. We will make use of this observation to make the
proofs of various statements more convenient, such as the following.


Lemma 1.1 Rotations around the origin and translations are isometries. 


Proof Let $\varphi(z) = e^{i\theta}z$ for some angle $\theta$. We check directly that

\[ |\varphi(z_1) - \varphi(z_2)| = |e^{i\theta}z_1 - e^{i\theta}z_2| = |e^{i\theta}(z_1 - z_2)| = |e^{i\theta}||z_1 - z_2| = |z_1 - z_2|. \]

Thus, rotations are isometries. The case for translations is even easier: let $\varphi(z) = z + b$
for some complex number $b$; then

\[ |\varphi(z_1) - \varphi(z_2)| = |z_1 + b - z_2 - b| = |z_1 - z_2|, \]

directly showing that it is an isometry. □


One of the key facts about isometries is that composing them together gives you 
another isometry. An example of this is provided in Figure 1.3. 


Theorem 1.2 If $\Psi_1, \Psi_2$ are isometries, then $\Psi_1 \circ \Psi_2$ is an isometry.


Remark 1.2 Although we only prove this for functions $\mathbb{C} \to \mathbb{C}$, this is true for
isometries in general, and with effectively the same justification.


Proof We simply check the definition: for any $z_1, z_2 \in \mathbb{C}$,

\[ d_{\mathrm{Euclid}}(\Psi_1(\Psi_2(z_1)), \Psi_1(\Psi_2(z_2))) = d_{\mathrm{Euclid}}(\Psi_2(z_1), \Psi_2(z_2)) = d_{\mathrm{Euclid}}(z_1, z_2), \]

where we first used that $\Psi_1$ is an isometry and then that $\Psi_2$ is an isometry. □


Fig. 1.3 An illustration of the composition of two isometries. (a) shows an initial configuration;
(b) shows the effect of an isometry $\Psi_1$ on (a); (c) shows the effect of an isometry $\Psi_2$ on (a); (d)
shows the effect of $\Psi_2 \circ \Psi_1$ on (a).


An immediate corollary of this is that any combination of translations and rotations
about the origin is an isometry. In particular, all affine maps of the form $z \mapsto e^{i\theta}z + b$
are necessarily isometries by Lemma 1.1. On the other hand, we shall shortly see
two things: first, not all affine maps are isometries; second, not all isometries are
affine maps. Let's begin with the latter assertion; we will cover the former in the
next section.


Lemma 1.2 Complex conjugation $z \mapsto \bar{z}$ is an isometry, but it is not an affine map.


Proof Note that for any $z_1, z_2 \in \mathbb{C}$,

\[ |\bar{z}_1 - \bar{z}_2| = \sqrt{(\bar{z}_1 - \bar{z}_2)(z_1 - z_2)} = |z_1 - z_2|, \]

so $z \mapsto \bar{z}$ is an isometry. Now, suppose that there exist $a, b \in \mathbb{C}$ such that $\bar{z} = az + b$
for all $z \in \mathbb{C}$. If we evaluate at $z = 0$, we get that $b = 0$. If we evaluate at $z = 1$, we
get that $a = 1$. But it is certainly false that $\bar{z} = z$ for all complex numbers $z$. □


What sort of an isometry is $z \mapsto \bar{z}$? Since $\overline{x + iy} = x - iy$, we see it is just a
reflection across the real line! We will see later that, in general, reflections are never
affine maps. On the other hand, this map $z \mapsto \bar{z}$ is in some sense the only obstruction
preventing affine maps from describing all Euclidean isometries.


Fig. 1.4 The effect of a similarity on a triangle.


▶ Example Find an isometry $\Psi$ such that $\Psi(0) = 1$, $\Psi(1) = 1 + i$, $\Psi(i) = 2$.
Unfortunately, we don't know of very many isometries yet, so we will simply guess
that we can find one of the form $z \mapsto az + b$ or $z \mapsto a\bar{z} + b$. In either case, the fact that
$\Psi(0) = 1$ forces $b = 1$, and the fact that $\Psi(1) = a + 1 = 1 + i$ forces $a = i$. However,
if it were the case that $\Psi(z) = iz + 1$, then $\Psi(i) = i^2 + 1 = 0 \neq 2$. So, if this
approach works at all, then it must be that $\Psi(z) = i\bar{z} + 1$. Since $\Psi(i) = i\bar{i} + 1 = 2$,
this is a valid solution. It is indeed an isometry, since it is the composition of complex
conjugation, a rotation, and a translation.
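
As a quick sanity check of this example, the two candidate maps can be evaluated directly. This is a small Python sketch with hypothetical function names (`psi_conj`, `psi_plain`); it only tests the three prescribed points and the distances between them, so it illustrates the computation rather than proving that the map is an isometry.

```python
def psi_conj(z: complex) -> complex:
    """Candidate from the example: z -> i * conj(z) + 1."""
    return 1j * z.conjugate() + 1

def psi_plain(z: complex) -> complex:
    """Rejected candidate: z -> i * z + 1."""
    return 1j * z + 1

# Required images: 0 -> 1, 1 -> 1 + i, i -> 2.
targets = {0j: 1 + 0j, 1 + 0j: 1 + 1j, 1j: 2 + 0j}
conj_ok = all(abs(psi_conj(z) - w) < 1e-12 for z, w in targets.items())
plain_ok = all(abs(psi_plain(z) - w) < 1e-12 for z, w in targets.items())

# Distances between the sample points are unchanged by psi_conj.
points = list(targets)
dist_ok = all(
    abs(abs(psi_conj(p) - psi_conj(q)) - abs(p - q)) < 1e-12
    for p in points for q in points
)
print(conj_ok, plain_ok, dist_ok)
```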


1.3 Similarities 


It is easy to see that dilations $\varphi(z) = rz$ are not isometries unless $r = 1$. Indeed,

\[ d_{\mathrm{Euclid}}(\varphi(1), \varphi(0)) = d_{\mathrm{Euclid}}(r, 0) = r \neq 1 = d_{\mathrm{Euclid}}(1, 0). \]

However, while such transformations don't preserve distances, they do preserve ratios
between distances; that is, they are similarities.


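Definition 1.3 A function $\Psi : \mathbb{C} \to \mathbb{C}$ is a similarity if it does not change the ratio
between distances, that is, if for all triples of distinct points $z_1, z_2, z_3 \in \mathbb{C}$,

\[ \frac{d_{\mathrm{Euclid}}(\Psi(z_1), \Psi(z_2))}{d_{\mathrm{Euclid}}(\Psi(z_1), \Psi(z_3))} = \frac{d_{\mathrm{Euclid}}(z_1, z_2)}{d_{\mathrm{Euclid}}(z_1, z_3)}. \]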


One can define similarities in the same generality as one can define isometries. 
Surprisingly, similarities that are not isometries are comparatively rare—for example, 
an exercise in one of the later chapters shows that a function on hyperbolic space is 
a similarity if and only if it is an isometry!

It is easy to see that any isometry is a similarity. Are there similarities that are
not isometries? Yes: Figure 1.4 illustrates an example. More explicitly, take any
non-trivial dilation.


Lemma 1.3 Dilations are similarities. Non-trivial dilations are not isometries. 


Proof Let $\varphi(z) = rz$ for some $r > 0$, and check

\[ \frac{|\varphi(z_1) - \varphi(z_2)|}{|\varphi(z_1) - \varphi(z_3)|} = \frac{|rz_1 - rz_2|}{|rz_1 - rz_3|} = \frac{|r||z_1 - z_2|}{|r||z_1 - z_3|} = \frac{|z_1 - z_2|}{|z_1 - z_3|}, \]

which works for all $z_1, z_2, z_3 \in \mathbb{C}$. Therefore, $\varphi$ is a similarity. On the other hand,
as we already remarked at the beginning of this section, non-trivial dilations (i.e.
dilations that are not the identity map $z \mapsto z$) are not isometries. □


Can we produce more examples of similarities? Absolutely: we can build them 
from other similarities and isometries, as we shall now show. 


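Lemma 1.4 Let $\Psi$ be a similarity. Then there exists a constant $c$ (called the constant
of proportionality) such that for any two distinct points $z_1, z_2 \in \mathbb{C}$,

\[ c = \frac{d_{\mathrm{Euclid}}(\Psi(z_1), \Psi(z_2))}{d_{\mathrm{Euclid}}(z_1, z_2)}. \]


Proof For any two distinct points $z_1, z_2 \in \mathbb{C}$ define

\[ \lambda_{z_1,z_2} = \frac{d_{\mathrm{Euclid}}(\Psi(z_1), \Psi(z_2))}{d_{\mathrm{Euclid}}(z_1, z_2)} = \frac{|\Psi(z_1) - \Psi(z_2)|}{|z_1 - z_2|}. \]

We need to prove that $\lambda_{z_1,z_2}$ is the same for any choice of $z_1, z_2$. We do this in the
following way: we first prove that $\lambda_{z_1,z_2} = \lambda_{z_1,z_3}$ for any three points $z_1, z_2, z_3$. To
see that this is true, notice that

\[ \frac{\lambda_{z_1,z_2}}{\lambda_{z_1,z_3}} = \frac{|\Psi(z_1) - \Psi(z_2)| / |z_1 - z_2|}{|\Psi(z_1) - \Psi(z_3)| / |z_1 - z_3|} = \frac{|\Psi(z_1) - \Psi(z_2)| / |\Psi(z_1) - \Psi(z_3)|}{|z_1 - z_2| / |z_1 - z_3|} = 1, \]

where we used that $\Psi$ is a similarity. Additionally, it is easy to check that $\lambda_{z_1,z_2} =
\lambda_{z_2,z_1}$. But now, this means that for any two pairs of points $z_1, z_2$ and $z_3, z_4$, we
can conclude that $\lambda_{z_1,z_2} = \lambda_{z_1,z_3} = \lambda_{z_3,z_1} = \lambda_{z_3,z_4}$. Consequently, we can simply
write $\lambda = \lambda_{z_1,z_2}$, secure in the knowledge that $\lambda$ doesn't depend on the choice of $z_1$
or $z_2$. □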


Lemma 1.5 Let $\Psi_1, \Psi_2$ be similarities with constants of proportionality $c_1, c_2$, respectively.
Then $\Psi_1 \circ \Psi_2$ is also a similarity with constant of proportionality $c_1c_2$.


Proof I leave the proof to the reader. (See Exercise 1.2.3.) □


Theorem 1.3 Let $\Psi$ be a similarity. There exists some constant $r > 0$ and an
isometry $\psi$ such that $\Psi = \varphi \circ \psi$ where $\varphi(z) = rz$.

Fig. 1.5 An illustration of the effect of the map z all 4g 4+ 5 on the complex plane.


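Proof Let $r$ be the constant of proportionality of $\Psi$, and define $\varphi(z) = rz$. Consider
the function $\psi = \varphi^{-1} \circ \Psi$; if we prove that this is an isometry, we will be done. To
show this, consider the ratio

\[
\begin{aligned}
\frac{|\psi(z_1) - \psi(z_2)|}{|z_1 - z_2|} &= \frac{|\varphi^{-1}(\Psi(z_1)) - \varphi^{-1}(\Psi(z_2))|}{|z_1 - z_2|} \\
&= \frac{|r^{-1}\Psi(z_1) - r^{-1}\Psi(z_2)|}{|z_1 - z_2|} \\
&= \frac{|r^{-1}||\Psi(z_1) - \Psi(z_2)|}{|z_1 - z_2|} \\
&= \frac{1}{r} \cdot \frac{|\Psi(z_1) - \Psi(z_2)|}{|z_1 - z_2|} \\
&= 1,
\end{aligned}
\]

where we used at the end the definition of the constant of proportionality. Thus, we
have shown that for all $z_1, z_2$, $|\psi(z_1) - \psi(z_2)| = |z_1 - z_2|$, which is to say that $\psi$ is
an isometry. Ergo, $\Psi = \varphi \circ \psi$, as desired. □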
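
The factorization in this proof is easy to experiment with. The sketch below is an illustration under stated assumptions, not a definitive implementation: `factor_similarity` is a hypothetical helper, the sample map `Psi` is chosen arbitrarily, and the check only samples random pairs of points rather than proving anything.

```python
import random

def factor_similarity(Psi):
    """Return (r, psi) with Psi(z) = r * psi(z).

    Assumes Psi really is a similarity; r is read off from the pair 0, 1.
    """
    r = abs(Psi(1) - Psi(0)) / abs(1 - 0)  # constant of proportionality
    def psi(z):
        return Psi(z) / r  # phi^{-1}(Psi(z)) with phi(z) = r z
    return r, psi

# Sample similarity, chosen only for illustration.
def Psi(z):
    return (3 - 4j) * z + (1 + 1j)

r, psi = factor_similarity(Psi)

rng = random.Random(0)
pairs = [
    (complex(rng.uniform(-5, 5), rng.uniform(-5, 5)),
     complex(rng.uniform(-5, 5), rng.uniform(-5, 5)))
    for _ in range(100)
]
# psi should preserve every sampled distance, up to rounding error.
max_error = max(abs(abs(psi(z1) - psi(z2)) - abs(z1 - z2)) for z1, z2 in pairs)
print(r, max_error < 1e-9)
```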


This gives us an intuitive picture of what similarities are: they are just like isome- 
tries, except that they allow us to rescale everything by some constant factor. Much 
like isometries, similarities are crucially important in Euclidean geometry. For exam- 
ple, we will prove later that two triangles are similar if and only if there is a similarity 
that takes one to the other. We also now have enough information to be able to char- 
acterize affine maps. 


Theorem 1.4 Let $a, b$ be complex numbers with $a \neq 0$. Then both the functions
$z \mapsto az + b$ and $z \mapsto a\bar{z} + b$ are similarities.


Remark 1.3 An example of a similarity of this form is provided in Figure 1.5.

Proof We know from Theorem 1.1 that any map $\varphi(z) = az + b$ is a composition
of a rotation, a dilation, and a translation. We now know that those are all similarities,
hence $\varphi$ is a similarity by Lemma 1.5. On the other hand, $a\bar{z} + b = \varphi(\bar{z})$, so this is just the
composition of a similarity with the similarity $z \mapsto \bar{z}$; it must also be a similarity. □


This fact serves a dual purpose: on the one hand, it gives an intuitive idea of 
what affine maps are. On the other hand, it gives an algebraic description of (some) 
similarities. Both of these are useful, and allow us to jump between algebra and 
geometry as necessary.


▶ Example Compute the constant of proportionality of the transformation $\varphi(z) =
(1 + 2i)z + 3 - i$.

Since we can use any two points to compute the constant of proportionality, it
behooves us to choose the two simplest points: $0$ and $1$. Then we note that if $c$
is the constant of proportionality, then

\[ c = \frac{|\varphi(1) - \varphi(0)|}{|1 - 0|} = |(1 + 2i) \cdot 1 + 3 - i - ((1 + 2i) \cdot 0 + 3 - i)| = |1 + 2i| = \sqrt{5}. \]

(See also Exercise 1.2.4.) 
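
To see Lemma 1.4 at work on this example, one can compute the same ratio for several pairs of points. In the following Python sketch, the pair $(0, 1)$ comes from the text, while the other pairs and the helper name `ratio` are arbitrary choices made for illustration.

```python
import math

def phi(z: complex) -> complex:
    """The map from the example: z -> (1 + 2i) z + 3 - i."""
    return (1 + 2j) * z + 3 - 1j

def ratio(z1: complex, z2: complex) -> float:
    """Ratio of image distance to original distance, for z1 != z2."""
    return abs(phi(z1) - phi(z2)) / abs(z1 - z2)

# The pair (0, 1) from the text, plus a few arbitrary pairs.
pairs = [(0, 1), (2 - 1j, -3 + 4j), (0.5j, 7), (-1 - 1j, 1 + 1j)]
ratios = [ratio(z1, z2) for z1, z2 in pairs]
print(ratios, math.sqrt(5))
```

Every entry of `ratios` should agree with $\sqrt{5}$ up to floating-point rounding.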


▶ Example Find a similarity that takes a triangle with vertices at $0$, $1$, $2 + 2i$ to a
triangle with vertices at $2 - i$, $3 - 3i$, $8 - 3i$.

It is a good idea to graph these two triangles to confirm that they are actually similar 
and to figure out which vertices should correspond to which. 


We see that our similarity should send $0 \mapsto 2 - i$, $1 \mapsto 3 - 3i$, and $2 + 2i \mapsto 8 - 3i$.
We shall try to find $a, b$ such that $\varphi(z) = az + b$ has the desired effect. (We do not
know at this time that this is guaranteed to work, but we don't know of any other
way to create similarities.) Since $0 \mapsto 2 - i$, we must have $\varphi(0) = b = 2 - i$. Since
$1 \mapsto 3 - 3i$, we must have

\[ \varphi(1) = a \cdot 1 + 2 - i = a + 2 - i = 3 - 3i, \]

whence $a = 3 - 3i - (2 - i) = 1 - 2i$. It remains to confirm that the last point is
sent to the right place. Indeed,

\[ \varphi(2 + 2i) = (1 - 2i)(2 + 2i) + (2 - i) = 6 - 2i + 2 - i = 8 - 3i, \]

as desired. Therefore, $\varphi(z) = (1 - 2i)z + (2 - i)$ is a similarity that has the correct
effect on our triangle.
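
Instead of graphing the triangles, the matching of vertices can also be found by brute force. The sketch below is one possible approach, not the method of the text: it tries every ordering of the target vertices and keeps those for which a map $z \mapsto az + b$ fits all three points. The helper name `fit_affine` and the tolerance are assumptions.

```python
from itertools import permutations

def fit_affine(src, dst, tol=1e-9):
    """Return (a, b) with a*s + b = d for all paired points, or None."""
    a = (dst[1] - dst[0]) / (src[1] - src[0])
    b = dst[0] - a * src[0]
    if all(abs(a * s + b - d) < tol for s, d in zip(src, dst)):
        return a, b
    return None

source = [0, 1, 2 + 2j]
target = [2 - 1j, 3 - 3j, 8 - 3j]

# Map each workable ordering of the target vertices to its coefficients.
fits = {}
for order in permutations(target):
    result = fit_affine(source, list(order))
    if result is not None:
        fits[order] = result
print(fits)
```

This search only considers maps of the form $z \mapsto az + b$; a pair of triangles related by a reflection would need the same search with $z \mapsto a\bar{z} + b$.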


1.4 Classifying Similarities 


Theorem 1.4 tells us that maps $z \mapsto az + b$ and $z \mapsto a\bar{z} + b$ are similarities.
Marvelously, the converse is also true: all similarities are of one of these two forms.


Theorem 1.5 (Classification of Similarities)
Let $\Psi$ be a similarity. Then there exist complex numbers $a, b$ such that either
$\Psi(z) = az + b$ for all $z \in \mathbb{C}$, or $\Psi(z) = a\bar{z} + b$ for all $z \in \mathbb{C}$.


Proof We will begin by considering a simple case: assume that $\Psi(0) = 0$ and
$\Psi(1) = 1$. We will think about what $\Psi(z)$ can be. First, note that the constant of
proportionality is

\[ \frac{|\Psi(1) - \Psi(0)|}{|1 - 0|} = \frac{|1 - 0|}{|1 - 0|} = 1, \]

hence $\Psi$ is an isometry by Theorem 1.3. So, choose any point $z \in \mathbb{C}$. We know that

\[ |\Psi(z) - \Psi(0)| = |z - 0|, \qquad |\Psi(z) - \Psi(1)| = |z - 1|. \]

How many points $w = \Psi(z)$ are there that satisfy those conditions? Well, let's
simplify a little and note that the above two conditions can be written as $|w| = |z|$
and $|w - 1| = |z - 1|$. Furthermore,

\[ |w - 1|^2 = (w - 1)(\bar{w} - 1) = w\bar{w} - w - \bar{w} + 1 = |w|^2 - 2\Re(w) + 1, \]

and by the same logic $|z - 1|^2 = |z|^2 - 2\Re(z) + 1$. Knowing $|w| = |z|$, we get that
$\Re(w) = \Re(z)$. Well, $|z|^2 = \Re(z)^2 + \Im(z)^2$ and $|w|^2 = \Re(w)^2 + \Im(w)^2$; ergo,
$\Im(w) = \pm\Im(z)$. Therefore, either $w = \Re(z) + \Im(z)i = z$ or $w = \Re(z) - \Im(z)i = \bar{z}$.
We would still like to know that it can't be that $\Psi(z) = z$ for some $z$, but $\Psi(z) = \bar{z}$
for other $z$. Suppose that this happens, i.e. there exist (non-real) complex numbers
$z_1, z_2$ such that $\Psi(z_1) = z_1$ and $\Psi(z_2) = \bar{z}_2$. Then

\[ |\Psi(z_1) - \Psi(z_2)| = |z_1 - \bar{z}_2| = |z_1 - z_2|. \]

Write $z_1 = x_1 + y_1i$ and $z_2 = x_2 + y_2i$, and expand out the above:

\[
\begin{aligned}
|z_1 - \bar{z}_2|^2 &= |x_1 + y_1i - x_2 + y_2i|^2 = (x_1 - x_2)^2 + (y_1 + y_2)^2, \\
|z_1 - z_2|^2 &= |x_1 + y_1i - x_2 - y_2i|^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2,
\end{aligned}
\]

so the two can be equal only if $(y_1 + y_2)^2 = (y_1 - y_2)^2$. But

\[
\begin{aligned}
(y_1 + y_2)^2 &= y_1^2 + 2y_1y_2 + y_2^2, \\
(y_1 - y_2)^2 &= y_1^2 - 2y_1y_2 + y_2^2,
\end{aligned}
\]

so equality only holds if $y_1y_2 = -y_1y_2$, which is impossible since $y_1, y_2 \neq 0$ by
assumption. Therefore, we have to conclude that either $\Psi(z) = z$ for all $z \in \mathbb{C}$, or
$\Psi(z) = \bar{z}$ for all $z \in \mathbb{C}$.

However, we have only proved the case where $\Psi(0) = 0$ and $\Psi(1) = 1$. To prove
the general case, we will show we can actually always reduce to this simple scenario.
To wit, suppose that $\Psi(0) = w_0$ and $\Psi(1) = w_1$. Consider the transformation

\[ \varphi : \mathbb{C} \to \mathbb{C}, \qquad z \mapsto \frac{1}{w_1 - w_0}z - \frac{w_0}{w_1 - w_0}. \]

We know that it is a similarity by Theorem 1.4, and therefore $\psi = \varphi \circ \Psi$ is a
similarity. On the other hand,

\[
\begin{aligned}
\psi(0) &= \varphi(\Psi(0)) = \varphi(w_0) = \frac{w_0}{w_1 - w_0} - \frac{w_0}{w_1 - w_0} = 0, \\
\psi(1) &= \varphi(\Psi(1)) = \varphi(w_1) = \frac{w_1}{w_1 - w_0} - \frac{w_0}{w_1 - w_0} = 1.
\end{aligned}
\]

By our preceding discussion, either $\psi(z) = z$ for all $z \in \mathbb{C}$, or $\psi(z) = \bar{z}$ for all
$z \in \mathbb{C}$. Since $\Psi = \varphi^{-1} \circ \psi$, and it is easy to check that

\[ \varphi^{-1}(z) = (w_1 - w_0)z + w_0 \]

(see Exercise 1.2.5), we see that either $\Psi(z) = (w_1 - w_0)z + w_0$ for all $z$, or $\Psi(z) =
(w_1 - w_0)\bar{z} + w_0$ for all $z$. In either case, we have proved what we wanted. □
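
The proof is constructive enough to turn into a small procedure: the values at $0$ and $1$ determine $a$ and $b$, and one further value (here at $i$) decides between the two forms. The Python sketch below illustrates this under the assumption that the input really is a similarity; the function name `classify_similarity`, the tolerance, and the two sample maps are hypothetical.

```python
def classify_similarity(Psi, tol=1e-9):
    """Recover (a, b, conjugated) from a similarity given as a callable.

    Assumes Psi is a similarity of the plane; follows the reduction in the proof.
    """
    w0, w1 = Psi(0), Psi(1)
    a, b = w1 - w0, w0            # both forms satisfy Psi(0) = b, Psi(1) = a + b
    image_of_i = Psi(1j)
    if abs(image_of_i - (a * 1j + b)) < tol:
        return a, b, False        # Psi(z) = a z + b
    if abs(image_of_i - (a * (-1j) + b)) < tol:
        return a, b, True         # Psi(z) = a conj(z) + b
    raise ValueError("Psi does not look like a similarity")

# Two sample similarities, chosen only for illustration.
plain = classify_similarity(lambda z: (2 + 1j) * z - 3j)
flipped = classify_similarity(lambda z: (2 + 1j) * z.conjugate() - 3j)
print(plain, flipped)
```

Testing at $i$ works because $i$ is not real, so $az + b$ and $a\bar{z} + b$ take different values there whenever $a \neq 0$.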
Fig. 1.6 On the left, the original image. On the right, the image under a similarity. Note that the
lines are still lines, the circles are still circles, and the angle measures between lines do not change.


Theorem 1.5 is very useful; now that we understand what similarities are, it is 
easy to prove various properties that they have. 


Theorem 1.6 Let $\Psi$ be a similarity of $\mathbb{C}$. Then all of the following statements are
true.


1. $\Psi$ is continuous.
2. $\Psi$ has an inverse $\Psi^{-1}$, which is itself a similarity.
3. The image of any line under $\Psi$ is a line.
4. The image of any circle under $\Psi$ is a circle.
5. Given two lines $l_1, l_2$ that intersect at an angle $\theta$, their images under $\Psi$ are two
lines $\Psi(l_1), \Psi(l_2)$ that intersect at an angle $\theta$.


Remark 1.4 A visual illustration of the type of properties that are preserved under 
similarities is provided in Figure 1.6. 


Proof By the composition theorem for complex affine maps and the classification 
of similarities, we know that any similarity is a composition of rotations, dilations, 
translations, and reflections. It is easy to see that all of those transformations are 
continuous, they have inverses that are similarities, they map lines to lines, they 
map circles to circles, and they don’t change the angles between intersecting lines. 
Therefore, compositions of them have all those same properties. □


Remark 1.5 One can prove something stronger than mere continuity here. If you 
think of $\Psi$ as being a function in two real variables, then it is (real) differentiable,
and indeed smooth, meaning that its derivative is also real differentiable, and so on 
and so forth. 


Philosophical Principle 


In the proof of this theorem, we have hit on an important idea: to prove that a
family of mathematical objects has a property, try to decompose those objects into
simple ones. Then, show that those simple objects have that property, and try to
leverage this to prove that every member of the family has the desired property.


Fig. 1.7 The map $z \mapsto -\bar{z}$ reverses orientation.


In light of Theorem 1.6, we say that similarities are angle preserving. We will see 
later that all linear fractional transformations are angle preserving, even though they 
will no longer necessarily map lines to lines. This is a very useful property that we 
will exploit extensively, particularly in later chapters. 

There is a different property that is preserved by some similarities but not others,
specifically, orientation. Intuitively, we know that mirrors reverse “handedness”; the
mathematical term for this property is orientation. Furthermore, it is not hard
to see from illustrations like Figures 1.5 and 1.7 that transformations of the form
$z \mapsto a\bar{z} + b$ reverse orientation. However, formally defining orientation is difficult
to do in general: it requires knowledge of either differential topology or homology. 
This is far more machinery than we want to introduce. Thankfully, we can do it much 
more simply for our specific case. 


Definition 1.4 Let $\varphi : \mathbb{C} \to \mathbb{C}$ be a transformation that maps circles to circles.
We say that $\varphi$ is orientation-preserving if for any circular path $C$ traversed
counter-clockwise, the image is a circular path $\varphi(C)$ that is also traversed counter-clockwise.
We say that $\varphi$ is orientation-reversing if for any circular path $C$ traversed
counter-clockwise, the image is a circular path $\varphi(C)$ that is traversed clockwise.


Remark 1.6 Definition 1.4 can only make sense in the context of maps that preserve 
circles—thankfully, by Theorem 1.6, we know that similarities qualify. We will have 
to revisit this definition in Chapter 2 when we consider maps that do not always send 
circles to circles. 


We can now easily show that similarities split into orientation-preserving and
orientation-reversing along the expected lines.

Theorem 1.7 (Classification of Orientation-Preserving/Reversing Similarities)
Let $\Psi$ be a similarity of $\mathbb{C}$. Exactly one of the following is true.


1. $\Psi(z) = az + b$ for some $a, b \in \mathbb{C}$, and $\Psi$ is orientation-preserving.
2. $\Psi(z) = a\bar{z} + b$ for some $a, b \in \mathbb{C}$, and $\Psi$ is orientation-reversing.


Proof By the classification of similarities, we know that $\Psi$ is either of the form
$z \mapsto az + b$, or $z \mapsto a\bar{z} + b$. In the first case, by the composition theorem for
complex affine maps, $\Psi$ is a composition of a translation, a rotation around the
origin, and a dilation. It is easy to see that all three of these basic types of
transformations are orientation-preserving and therefore their composition is
orientation-preserving. In the second case, $\Psi$ is also composed with the reflection $z \mapsto \bar{z}$; it is
easy to see that this reflection is orientation-reversing. However, the composition of
an orientation-preserving map and an orientation-reversing map is an
orientation-reversing map. □


To reiterate, we can now describe the affine maps in the following beautiful way: 
they are precisely the orientation-preserving similarities on C! 


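▶ Example Let $\varphi(z) = i\bar{z} + 2$. Compute its inverse and confirm that it is a similarity.
If $\varphi(z) = i\bar{z} + 2$, then $z = i\,\overline{\varphi^{-1}(z)} + 2$, hence


$$i\,\overline{\varphi^{-1}(z)} = z - 2,$$
$$\overline{\varphi^{-1}(z)} = -iz + 2i,$$
$$\varphi^{-1}(z) = \overline{-iz + 2i} = i\bar{z} - 2i,$$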


which is indeed a similarity. (See also Exercise 1.2.5.) 
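
The computation above can be double-checked numerically. The following is a minimal sketch, not part of the original text: the function names `phi` and `phi_inv` and the sample points are chosen here for illustration only.

```python
# phi(z) = i * conj(z) + 2, as in the example above.
def phi(z: complex) -> complex:
    return 1j * z.conjugate() + 2


# Candidate inverse obtained in the example: phi^{-1}(z) = i * conj(z) - 2i.
def phi_inv(z: complex) -> complex:
    return 1j * z.conjugate() - 2j


# Arbitrary sample points; composing in either order should return the input.
samples = [0j, 1 + 0j, 1j, 3 - 4j, -2.5 + 0.5j]
for z in samples:
    assert abs(phi(phi_inv(z)) - z) < 1e-12
    assert abs(phi_inv(phi(z)) - z) < 1e-12
```

Since the inverse again has the form $z \mapsto a\bar{z} + b$ with $|a| = 1$, it is an orientation-reversing isometry, and in particular a similarity.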


1.5 Applications 


We have spent a significant amount of time classifying similarities and showing 
how they relate to linear fractional transformations. It would be good to know that 
all of this effort is actually worth it. We have already seen part of the payoff via 
Theorem 1.6. To add to this, we assemble here a collection of various ways that our 
machinery can be used to give short proofs of classical results in Euclidean geometry. 


Theorem 1.8 Let $\Delta_1, \Delta_2$ be two triangles. They are similar if and only if there exists
a similarity $\Psi$ such that $\Psi(\Delta_1) = \Delta_2$.


Proof Saying that $\Delta_1$ and $\Delta_2$ are similar is the same as saying that $\Delta_1$ has vertices
$v_1, v_2, v_3$ and $\Delta_2$ has vertices $w_1, w_2, w_3$ such that

$$\frac{d_{\mathrm{Euclid}}(v_1, v_2)}{d_{\mathrm{Euclid}}(v_1, v_3)} = \frac{d_{\mathrm{Euclid}}(w_1, w_2)}{d_{\mathrm{Euclid}}(w_1, w_3)},$$
$$\frac{d_{\mathrm{Euclid}}(v_2, v_3)}{d_{\mathrm{Euclid}}(v_2, v_1)} = \frac{d_{\mathrm{Euclid}}(w_2, w_3)}{d_{\mathrm{Euclid}}(w_2, w_1)},$$
$$\frac{d_{\mathrm{Euclid}}(v_3, v_1)}{d_{\mathrm{Euclid}}(v_3, v_2)} = \frac{d_{\mathrm{Euclid}}(w_3, w_1)}{d_{\mathrm{Euclid}}(w_3, w_2)}.$$


If there exists a similarity $\Psi$ such that $\Psi(\Delta_1) = \Delta_2$, then the above will be
satisfied, since all of these relations come from the defining property of a similarity. So,
the only difficulty is proving that if $\Delta_1$ and $\Delta_2$ are similar, then that had to have come
from some similarity taking one to the other. To prove this, we will first consider a
very basic case: suppose that $v_1 = w_1 = 0$ and $v_2 = w_2 = 1$. If this is so, then we
must have

$$\frac{|v_1 - v_2|}{|v_1 - v_3|} = \frac{|w_1 - w_2|}{|w_1 - w_3|}, \qquad \frac{|v_2 - v_3|}{|v_2 - v_1|} = \frac{|w_2 - w_3|}{|w_2 - w_1|},$$

whence

$$\frac{|0 - 1|}{|0 - v_3|} = \frac{|0 - 1|}{|0 - w_3|}, \qquad \frac{|1 - v_3|}{|1 - 0|} = \frac{|1 - w_3|}{|1 - 0|},$$

and so

$$|v_3| = |w_3|, \qquad |v_3 - 1| = |w_3 - 1|.$$

We saw previously in the course of the proof of the classification of similarities that
such equations have only two solutions: either $v_3 = w_3$ or $v_3 = \bar{w}_3$. Therefore, we
can take either $\Psi(z) = z$ or $\Psi(z) = \bar{z}$ and have $\Psi(\Delta_1) = \Delta_2$, as desired.

How do we reduce to this basic case, though? The observation is the following:
if we apply a similarity to $\Delta_1$ or $\Delta_2$, then the new triangles will still be similar.
Furthermore, if we can find a similarity between these two new triangles, then we
can compose it with the similarities used to move $\Delta_1$ and $\Delta_2$ into their special
position to get a new similarity $\Psi$ such that $\Psi(\Delta_1) = \Delta_2$. So, all we need to do is
to find a similarity that will move $v_1 \mapsto 0$, $v_2 \mapsto 1$. This is easy: use

$$\varphi(z) = \frac{1}{v_2 - v_1}\,(z - v_1).$$

Obtaining an analogous similarity that sends $w_1 \mapsto 0$, $w_2 \mapsto 1$, we are done. □
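
To make the reduction step concrete, here is a small sketch of the normalization used in the proof. The helper name `normalizer` and the two sample triangles are hypothetical, chosen only for illustration; the second triangle is built from the first by a rotation, a dilation, and a translation, so the two are similar by construction.

```python
def normalizer(v1: complex, v2: complex):
    """Return the similarity z -> (z - v1) / (v2 - v1), sending v1 -> 0 and v2 -> 1."""
    def phi(z: complex) -> complex:
        return (z - v1) / (v2 - v1)
    return phi


# Sample data (an assumption of this sketch): w is v rotated by 90 degrees,
# scaled by 2, and translated by 3 + i.
v = [0j, 2 + 0j, 1 + 3j]
w = [2j * z + (3 + 1j) for z in v]

phi_v = normalizer(v[0], v[1])
phi_w = normalizer(w[0], w[1])

# After normalization, the third vertices agree or are complex conjugates.
v3, w3 = phi_v(v[2]), phi_w(w[2])
same = abs(v3 - w3) < 1e-12
mirrored = abs(v3 - w3.conjugate()) < 1e-12
```

In this sample `same` holds, so composing `phi_v` with the inverse of `phi_w` gives an orientation-preserving similarity carrying the first triangle to the second.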


Corollary 1.1 Let $\Delta_1, \Delta_2$ be two triangles. They are congruent if and only if there
exists an isometry $\Psi$ such that $\Psi(\Delta_1) = \Delta_2$.

Fig. 1.8 A pair of reflections can be composed to give a rotation around the origin.


Remark 1.7 Generically, the isometry $\Psi$ such that $\Psi(\Delta_1) = \Delta_2$ is unique; however,
if $\Delta_1$ is somehow symmetric, then there can be multiple different isometries with the
same properties. Indeed, in the next section, we shall consider examples of isometries 
that move a polygon back onto itself—the identity map will always do this, but there 
may well be other examples. 


Proof By Theorem 1.8, we know that if $\Delta_1 \cong \Delta_2$, we can find a similarity $\Psi$ such that
$\Psi(\Delta_1) = \Delta_2$. However, since $\Delta_1 \cong \Delta_2$, the constant of proportionality of $\Psi$
must be 1, so it is an isometry. □


Theorem 1.9 Every isometry of C can be written as a composition of reflections. 


Proof I leave the proof to the reader. Figure 1.8 gives a hint of how to do it for
rotations. (See Exercise 1.2.9.) □


Theorem 1.10 If $\Psi$ is an orientation-preserving isometry of $\mathbb{C}$, then either $\Psi$ is a
translation, or it is a rotation around some point.


Proof Since $\Psi$ is orientation-preserving, by the classification of
orientation-preserving and orientation-reversing similarities, we know that $\Psi(z) = az + b$
for some complex numbers $a, b$. Since $\Psi$ is an isometry, we know that $|a| = 1$.
(See Exercise 1.2.4.) If $a = 1$, $\Psi$ is a translation. Otherwise, write $a = e^{i\theta}$. I
claim that $\Psi$ is a rotation by $\theta$ around some point $w$. This is so if $\Psi$ is of the form
$\Psi(z) = e^{i\theta} z + (1 - e^{i\theta}) w$ for some $w \in \mathbb{C}$. (See Exercise 1.2.2.) But since $e^{i\theta} \neq 1$,
we see that if we simply take

$$w = \frac{b}{1 - e^{i\theta}},$$

then we have shown that $\Psi$ can indeed be put in the desired form. □
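
As an illustration of the formula for the center of rotation in the proof of Theorem 1.10, the following sketch computes $w = b/(1 - a)$ for an isometry $z \mapsto az + b$ with $|a| = 1$, $a \neq 1$. The function name `rotation_center` and the sample coefficients are assumptions made here for the example.

```python
import cmath


def rotation_center(a: complex, b: complex) -> complex:
    """Center w of the rotation z -> a*z + b, assuming |a| = 1 and a != 1."""
    if abs(abs(a) - 1) > 1e-12 or abs(a - 1) < 1e-12:
        raise ValueError("need |a| = 1 and a != 1")
    return b / (1 - a)


a = cmath.exp(1j * cmath.pi / 2)  # rotation by a quarter turn
b = 1 + 1j
w = rotation_center(a, b)

# The center is the unique fixed point of the map.
assert abs(a * w + b - w) < 1e-12
```

For these sample values the center works out to $w = i$, since $(1 + i)/(1 - i) = i$.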
Fig. 1.9 A glide reflection.


Theorem 1.11 If $\Psi$ is an orientation-reversing isometry of $\mathbb{C}$, then either $\Psi$ is a
reflection across some line, or $\Psi$ is a glide reflection (that is, a reflection across
some line together with a translation along the direction of this line, such as what is
illustrated in Figure 1.9).


Proof I leave the proof to the reader. (See Exercise 1.2.10.) □


Theorem 1.12 Any isometry of C fixes either no points, one point, a line, or the 
entire plane. 


Proof The identity map $z \mapsto z$ fixes the entire plane. Assume that our isometry is
not the identity map. There aren’t many other options:


1. Translations and glide reflections fix no points.
2. Rotations fix one point.
3. Reflections fix a line.


By Theorems 1.10 and 1.11, this enumerates all possibilities. □


▶ Example Determine the set of points fixed by the similarity $\varphi(z) = e^{i\pi/5}\bar{z}$.

The set of fixed points is the collection of $z \in \mathbb{C}$ satisfying $z = e^{i\pi/5}\bar{z}$. Multiplying
by $z$ on both sides, we get $z^2 = |z|^2 e^{i\pi/5}$. Write $z = re^{i\theta}$. Then this becomes
$r^2 e^{2i\theta} = r^2 e^{i\pi/5}$. Ergo, the set of fixed points is the line of points of the form
$z = re^{i\pi/10}$ for some $r \in \mathbb{R}$.
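
The fixed line found in this example can be checked numerically. This is only a sanity check under floating-point arithmetic; the name `phi` and the sample radii are chosen here for illustration.

```python
import cmath


def phi(z: complex) -> complex:
    """The orientation-reversing isometry z -> e^{i*pi/5} * conj(z)."""
    return cmath.exp(1j * cmath.pi / 5) * z.conjugate()


# Points r * e^{i*pi/10}, with r real, should be fixed.
direction = cmath.exp(1j * cmath.pi / 10)
for r in (-2.0, -0.5, 0.0, 1.0, 3.0):
    z = r * direction
    assert abs(phi(z) - z) < 1e-12

# A point off that line is moved.
off_line = 1 + 0j
moved = abs(phi(off_line) - off_line) > 1e-6
```

This agrees with Theorem 1.12: the map is a reflection, and it fixes exactly a line.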
1.6 A Little Bit of Group Theory


The focus of this chapter can be summarized as trying to understand Euclidean geom- 
etry by studying transformations on it. For example, we might study the collection of 
isometries: this gives the usual notion of congruence that we are used to. We might 
study the collection of similarities: this gives the usual notion of similarity that comes 
up extensively in trigonometry. In short, we have had the following guiding thought. 


Philosophical Principle 


Rather than studying a type of geometry directly, study the collection of transfor- 
mations that preserve its basic properties. 


This philosophy is very prominent in modern mathematics. In fact, we can go 
further and try to define a geometry by starting with a collection of transformations 
and seeing what sort of properties they preserve—this is more or less precisely how 
we will introduce inversive geometry, and later hyperbolic geometry. Before I end 
this chapter, I want to briefly develop this philosophical idea further and specify what 
types of collections of transformations we are interested in—that is, I want to finish 
with a short introduction to groups. 


Definition 1.5 A group $(G, *, \iota)$ is a set $G$ together with a binary operation
$* : G \times G \to G$ (which we usually call the group operation or group multiplication)
satisfying the following properties.


1. For all $a, b, c \in G$, $(a * b) * c = a * (b * c)$; that is, the multiplication $*$ is
associative.

2. There exists an element $\iota \in G$ such that for all $a \in G$, $a * \iota = \iota * a = a$; that
is, there is an identity.

3. For every element $a \in G$, there exists an element $b \in G$ such that $a * b = b * a = \iota$;
that is, every element $a$ has an inverse.


A group is called abelian² if additionally for all $a, b \in G$, $ab = ba$.


Remark 1.8 Some authors include a “closure” axiom that states that for all $a, b \in G$,
$a * b \in G$. For us, this is packaged into the definition of a binary operation: after
all, we define the co-domain of $*$ to be $G$.


Remark 1.9 It is customary to denote the inverse of an element $b$ by $b^{-1}$. This is
justified by the fact that any element has only one inverse. (See Exercise 1.4.3.)


If it is clear from context what the group operation $*$ is, we will simply write $ab$
to mean $a * b$. We will also often refer to $G$ itself as a group, so we might refer, for
instance, to “the group of similarities Sim(C).” This is technically abuse of notation,
but it is far more convenient and I have never seen it be confusing in practice.


² Why on Earth are groups whose multiplication is commutative called abelian? They are named in
honor of Niels Henrik Abel, one of the very first group theorists.


Fig. 1.10 A diagram illustrating the arithmetic of even and odd numbers.

You might think that you haven’t seen groups before, but I assure you that you 
have: you just haven’t seen them under that name. Let me provide a few examples. 


1. The set of complex numbers $\mathbb{C}$ is an abelian group if we take $*$ to be addition and
$\iota = 0$. Indeed, addition of complex numbers is associative, 0 is an identity, and
every complex number $z$ has an additive inverse $-z$.

2. The set of real numbers $\mathbb{R}$, the set of rational numbers $\mathbb{Q}$, and the set of integers $\mathbb{Z}$
are all abelian groups if we take $*$ to be addition and $\iota = 0$, for the same reasons
as above.

3. The set of non-zero complex numbers $\mathbb{C}^\times$ is an abelian group if we take $*$ to be
multiplication and $\iota = 1$. Indeed, multiplication of complex numbers is associative,
1 is an identity, and every non-zero complex number $z$ has a multiplicative
inverse $z^{-1}$.

4. The set of non-zero real numbers $\mathbb{R}^\times$, the set of all positive real numbers $\mathbb{R}^+$,
the set of all non-zero rational numbers $\mathbb{Q}^\times$, and the set of all positive rational
numbers $\mathbb{Q}^+$ are all abelian groups if we take $*$ to be multiplication and $\iota = 1$,
for the same reasons as above.

5. The set {even, odd} is an abelian group if we take $*$ to be addition with the usual
rules that

even + even = even,
even + odd = odd,
odd + even = odd,
odd + odd = even,

and we take $\iota$ = even. Indeed, one can check that this addition is associative,
“even” is an identity, and each element has an inverse (see Exercise 1.4.1).
Figure 1.10 gives a visual guide to understanding this group.
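
Because the group {even, odd} is finite, the axioms of Definition 1.5 can be verified by brute force. The sketch below does exactly that; the encoding of the elements as strings and the function name `add` are choices made here, not notation from the text.

```python
from itertools import product

G = ("even", "odd")


def add(a: str, b: str) -> str:
    """Parity addition: the sum is even exactly when both parities agree."""
    return "even" if a == b else "odd"


# Associativity: check all 2^3 triples.
associative = all(
    add(add(a, b), c) == add(a, add(b, c)) for a, b, c in product(G, repeat=3)
)
# "even" acts as a two-sided identity.
has_identity = all(add("even", a) == a == add(a, "even") for a in G)
# Every element has a two-sided inverse.
has_inverses = all(
    any(add(a, b) == "even" == add(b, a) for b in G) for a in G
)
# The operation is commutative, so the group is abelian.
abelian = all(add(a, b) == add(b, a) for a, b in product(G, repeat=2))
```

All four flags evaluate to `True`. This kind of exhaustive check is only available for finite groups; for groups such as $\mathbb{C}$ or Sim(C) one needs an actual argument.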


All of the above are very important and worthy groups—however, there are a few 
examples that are more relevant to us. 


Theorem 1.13 Define Sim(C) to be the collection of similarities on $\mathbb{C}$. Then Sim(C)
is a non-abelian group if we take the operation to be composition and the identity to
be $\iota(z) = z$.

Proof Let’s attack this piece by piece. First, we confirm that function composition
$\circ$ is a binary operation on Sim(C); that is, if we compose two similarities, we get
another similarity. We know that this is true by Lemma 1.5. Second, we show that
the group operation is associative. But of course it is: if $\varphi_1, \varphi_2, \varphi_3 \in \mathrm{Sim}(\mathbb{C})$, then

$$(\varphi_1 \circ (\varphi_2 \circ \varphi_3))(z) = \varphi_1(\varphi_2(\varphi_3(z))) = ((\varphi_1 \circ \varphi_2) \circ \varphi_3)(z),$$

whence $\varphi_1 \circ (\varphi_2 \circ \varphi_3) = (\varphi_1 \circ \varphi_2) \circ \varphi_3$, and so $\circ$ is associative. Third, for all
$\varphi \in \mathrm{Sim}(\mathbb{C})$,

$$(\varphi \circ \iota)(z) = \varphi(\iota(z)) = \varphi(z) = \iota(\varphi(z)) = (\iota \circ \varphi)(z),$$

whence $\varphi \circ \iota = \iota \circ \varphi = \varphi$, and so $\iota$ is the identity. Fourth, we need to show that for every
$\varphi \in \mathrm{Sim}(\mathbb{C})$ there exists $\psi \in \mathrm{Sim}(\mathbb{C})$ such that $\varphi \circ \psi = \psi \circ \varphi = \iota$; we know that
this is true by Theorem 1.6. Finally, why is this a non-abelian group? Consider the
transformations

$$\phi_1(z) = z + 1,$$
$$\phi_2(z) = iz.$$

Both of these are elements in Sim(C), but

$$(\phi_1 \circ \phi_2)(z) = iz + 1,$$
$$(\phi_2 \circ \phi_1)(z) = iz + i,$$

which are different. □
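
The failure of commutativity can also be seen by bookkeeping on coefficients. In the sketch below, a map $z \mapsto az + b$ is stored as the pair `(a, b)`; this encoding and the function name `compose` are conventions chosen here for illustration, and they cover only the orientation-preserving similarities.

```python
def compose(f, g):
    """Coefficients of f o g, where a pair (a, b) stands for z -> a*z + b.

    (a1*(a2*z + b2) + b1) = (a1*a2)*z + (a1*b2 + b1).
    """
    a1, b1 = f
    a2, b2 = g
    return (a1 * a2, a1 * b2 + b1)


phi1 = (1 + 0j, 1 + 0j)  # z -> z + 1
phi2 = (1j, 0j)          # z -> i*z

first = compose(phi1, phi2)   # z -> i*z + 1
second = compose(phi2, phi1)  # z -> i*z + i
```

The two results share the same linear coefficient $i$ but differ in the constant term, which is exactly the discrepancy used in the proof.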


We saw that $\mathbb{C}$ and $\mathbb{C}^\times$ both contain smaller groups that are interesting in their
own right; so does Sim(C).


Theorem 1.14 All of the following are non-abelian groups if we take the operation
to be composition and the identity to be $\iota(z) = z$.


1. Sim°(C): the collection of all orientation-preserving similarities of $\mathbb{C}$.
2. Isom(C): the collection of all isometries of $\mathbb{C}$.
3. Isom°(C): the collection of all orientation-preserving isometries of $\mathbb{C}$.


Proof I leave the proof that they are groups to the reader. (See Exercise 1.4.5.) To see
that they are non-abelian, it is sufficient to note that the two transformations $\phi_1, \phi_2$
that we used to prove that Sim(C) is non-abelian are also elements of all of these
groups. □


What gives? Why is every interesting collection of transformations a group? If you 
think about it, this makes perfect sense. First of all, our collection had better be closed 
under composition—at worst, if it is not, then we enlarge it until it is. Composition of 
functions is always associative, so we get that property for free. Whatever collection
of transformations we have, the identity transformation $z \mapsto z$ that just doesn’t do
anything to our underlying space is always an option. The only requirement that is
at all restrictive is that every transformation must have an inverse. However, in the
context of geometrical transformations, this requirement will typically be satisfied
because whatever transformation we perform, we can usually simply undo it, and that
will be our desired inverse. This simple observation encapsulates why group theory is central
to much of modern geometry, and allows us to refine the philosophical statement that
we voiced previously.


Philosophical Principle 


To study a geometry, determine some invariants (such as length or angles) that 
characterize that geometry. Then, study groups of transformations of this geometry 
that preserve these invariants (such as isometries or similarities).³


Moreover, sometimes you will want to consider geometries that are refinements of 
each other—we know, for instance, that studying Euclidean geometry via congruence 
is a refinement of studying it up to similarity. In these cases, we look for a subset of 
the original group of transformations; of course, that subset should itself be a group 
via the same operation as the original. In other words, it should be a subgroup. 


Definition 1.6 Let $(G, *, \iota)$ be a group. A subgroup $H$ of $G$ is a non-empty subset
$H \subseteq G$ such that for all $g, h \in H$, $gh \in H$ and $h^{-1} \in H$.


Remark 1.10 It isn’t hard to prove that $H$ is a subgroup if and only if $(H, *, \iota)$ is a
group, thus justifying the name. (See Exercise 1.4.4.)


So, for example, we have shown that Sim°(C), Isom(C), and Isom°(C) are all
subgroups of Sim(C). These are all subgroups that have infinitely many elements— 
they fit into the portion of modern geometry known as Lie theory. Before we finish, 
I want to give an example of a subgroup of Sim(C) that is finite—such examples 
are also interesting, but typically appear in slightly different areas of mathematics, 
such as geometric group theory. To be concrete, we are going to define the isometry 
group of the square. 


Definition 1.7 Let ◇ denote the square with vertices $\pm 1, \pm i$. Define Isom(◇) to be
the set of $\varphi \in \mathrm{Isom}(\mathbb{C})$ such that $\varphi(◇) = ◇$.


Lemma 1.6 Isom(◇) is a subgroup of Isom(C).

Proof First, note that if $\varphi_1, \varphi_2 \in$ Isom(◇), then $\varphi_1(\varphi_2(◇)) = ◇$, hence
$\varphi_1 \circ \varphi_2 \in$ Isom(◇). Secondly, if $\varphi \in$ Isom(◇), then certainly there exists
$\varphi^{-1} \in$ Isom(C), and since $\varphi(◇) = ◇$, we see that $\varphi^{-1}(◇) = ◇$, hence $\varphi^{-1} \in$ Isom(◇). □


³ This philosophy is known as the Erlangen program; it was originally proposed by Felix Klein in
1872 [8].

Fig. 1.11 A diagram illustrating all of the various isometries of the square. The purple arrows
correspond to rotations, while the green arrows correspond to reflections.


To figure out what transformations this group consists of, we will first think
about a simpler group; namely, Isom°(◇), the collection of orientation-preserving
transformations that send ◇ $\mapsto$ ◇. This set can also be understood as
Isom(◇) $\cap$ Isom°(C), and it is easily seen that it is a group.


Lemma 1.7 There are only four elements in Isom°(◇): $z \mapsto z$, $z \mapsto iz$, $z \mapsto -z$,
$z \mapsto -iz$.


Proof Notice that 0 is the intersection of the diagonals of ◇; therefore, its image
under any isometry $\varphi$ must be the intersection of the diagonals of the square $\varphi(◇)$.
However, $\varphi(◇) = ◇$ if $\varphi \in$ Isom°(◇), so $\varphi(0) = 0$. Since $\varphi$ is orientation-preserving,
it must be a rotation around the point 0. Any such rotation will be totally determined
by where it sends 1. But since 1 is a vertex of the square, its image must be one of
the vertices of the square. This gives precisely the four options listed. □


The rest is easy. 


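Theorem 1.15 Isom(◇) is a group consisting of the following eight elements.


$$z \mapsto z, \quad z \mapsto iz, \quad z \mapsto -z, \quad z \mapsto -iz,$$
$$z \mapsto \bar{z}, \quad z \mapsto i\bar{z}, \quad z \mapsto -\bar{z}, \quad z \mapsto -i\bar{z}.$$


Remark 1.11 Figure 1.11 depicts the structure of this group.


Proof We already know that Isom(◇) is a group and it is easy to see that $\psi(z) = \bar{z}$
is an orientation-reversing transformation in Isom(◇). For any $\varphi \in$ Isom(◇) that is
orientation-reversing, we see that $\varphi \circ \psi$ is orientation-preserving; that is, it is inside
Isom°(◇). However, we already know everything inside Isom°(◇)! Since $\psi \circ \psi = \iota$, it
follows that $\varphi = \rho \circ \psi$ for one of the four rotations $\rho$ of Lemma 1.7. Thus, there are
only the eight given choices of isometries. □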
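
The eight maps of Theorem 1.15 are few enough to examine exhaustively. The sketch below encodes each one as a pair `(k, conj)`, meaning $z \mapsto i^k z$ or $z \mapsto i^k \bar{z}$; this encoding, together with the names `apply` and `compose`, is an assumption of the example rather than notation from the text. It checks that each map permutes the vertices $\pm 1, \pm i$ and that composing any two of the eight gives another of the eight.

```python
from itertools import product

I_POWERS = (1, 1j, -1, -1j)   # i^0, i^1, i^2, i^3
VERTICES = set(I_POWERS)      # vertices of the square

# (k, conj) encodes z -> i^k * z if conj is False, z -> i^k * conj(z) if True.
ELEMENTS = [(k, conj) for conj in (False, True) for k in range(4)]


def apply(elem, z):
    k, conj = elem
    return I_POWERS[k] * (z.conjugate() if conj else z)


def compose(f, g):
    """Encoding of f o g (apply g first, then f)."""
    k, f_conj = f
    m, g_conj = g
    if f_conj:
        # conj(i^m * w) = i^(-m) * conj(w), so the exponent of g is subtracted.
        return ((k - m) % 4, not g_conj)
    return ((k + m) % 4, g_conj)


pairs = list(product(ELEMENTS, repeat=2))
preserves_square = all(
    {apply(e, v) for v in VERTICES} == VERTICES for e in ELEMENTS
)
z0 = 2 + 1j  # a generic test point with small integer coordinates
composition_ok = all(
    apply(compose(f, g), z0) == apply(f, apply(g, z0)) for f, g in pairs
)
closed = all(compose(f, g) in ELEMENTS for f, g in pairs)
```

The check `composition_ok` confirms that the bookkeeping in `compose` matches actual composition of maps at the sample point, so `closed` really is a statement about the eight isometries.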


While a nice result, there is something a little artificial about it in that we have
only worked out the isometry group of this particular square. However, one can check
that the isometry group of any other square “looks the same” in some sense. (See
also Exercise 1.4.11.) In this broader context, the group Isom(◇) is better known
as the dihedral group of order 8, or $D_8$. It is often one of the first examples of a
group depicted in any course on the subject due to being easy to visualize yet already
demonstrating some of the complexities of the subject. (See also Exercise 1.4.12.)


Problems

1.1 COMPUTATIONAL EXERCISES


1. For each of the following, compute the image under $\varphi : \mathbb{C} \to \mathbb{C}$.


a) Line y = 3x — 1, pz) = 24+1)z-1. 
b) Line x = 4, y(z) = iz +i. 

c) Circle |z| = 2, y(z) = (1 — 3i)z. 

d) Circle |z — 2] = 1, p(z) = 1 — az +2. 


2. For each of the following, given $\varphi_1, \varphi_2 : \mathbb{C} \to \mathbb{C}$, compute $\varphi_1 \circ \varphi_2$ and $\varphi_2 \circ \varphi_1$.
Are they generally the same or different?


a) yi (z) = Re yo(z) = iz. b) yi(z) = 3 + 4i)z, yo(z) = A - 
= Le ee i)z. 

ee ere tele. ay Ole seal. 

©) pilZ) = iz, p2(@) = z+. f) gi) =Z, yo) =z +i. 

g) viz) =Z, poz) = 72 4+2. h) gi(z) =Z, yo(z) = Ut + 2)z. 


3. Find $a, b \in \mathbb{C}$ such that $z \mapsto az + b$ or $z \mapsto a\bar{z} + b$ is the desired similarity.


a) A translation that moves $2 - 3i \mapsto 1 + i$.

b) A rotation around the origin that moves $3 + 5i \mapsto (4 + i)\sqrt{2}$.

c) A rotation around $2 + \sqrt{3} + i$ by $\pi/6$ radians.

d) A reflection through the line $y = 1$.

e) A reflection through the line $y = x$.

f) A reflection through the line $y = 3x - 1$.

g) A glide reflection through the line $y = x$ moving $0 \mapsto 1 + i$.

h) An orientation-preserving similarity taking $1 + i \mapsto 3i$ and $-2 + i \mapsto 6i$.


1.2 PROOFS 


1. a) Prove de Moivre’s theorem that $\cos(nx) + i\sin(nx) = (\cos(x) + i\sin(x))^n$.
(Hint: Use Euler’s formula.)
b) Set $n = 2$ in the above, and expand the right-hand side. Use this to compute
$\cos(2x)$ and $\sin(2x)$ in terms of $\sin(x)$ and $\cos(x)$.
2. Our goal for this exercise is to find an explicit formula (of the form $z \mapsto az + b$)
describing a rotation of $\theta$ radians around a fixed point $w$.


a) Let $\Psi$ be the desired rotation. If we let $\varphi$ be a translation taking $w$ to 0 (so
that $\varphi^{-1}$ is a translation taking 0 to $w$), then what is the isometry $\varphi^{-1} \circ \Psi \circ \varphi$?
(Hint: it is a rotation.)

b) Use your answer to part a) to prove that $\Psi(z) = e^{i\theta} z + (1 - e^{i\theta}) w$.

3. a) Let $\Psi_1, \Psi_2$ be similarities. Show that $\Psi_1 \circ \Psi_2$ is a similarity.
b) Why is

$$\frac{d_{\mathrm{Euclid}}(\Psi_1(\Psi_2(z_1)), \Psi_1(\Psi_2(z_2)))}{d_{\mathrm{Euclid}}(\Psi_2(z_1), \Psi_2(z_2))}$$

equal to the constant of proportionality of $\Psi_1$?
c) Why is

$$\frac{d_{\mathrm{Euclid}}(\Psi_1(\Psi_2(z_1)), \Psi_1(\Psi_2(z_2)))}{d_{\mathrm{Euclid}}(\Psi_2(z_1), \Psi_2(z_2))} \cdot \frac{d_{\mathrm{Euclid}}(\Psi_2(z_1), \Psi_2(z_2))}{d_{\mathrm{Euclid}}(z_1, z_2)}$$

equal to the constant of proportionality of $\Psi_1 \circ \Psi_2$?
d) Why does Lemma 1.5 follow from parts a)-c)?
4. Prove that the constant of proportionality of the similarity $\varphi(z) = az + b$ is $|a|$.
5. a) Given $\varphi(z) = az + b$, compute the inverse $\varphi^{-1}(z)$.
b) Given $\varphi(z) = a\bar{z} + b$, compute the inverse $\varphi^{-1}(z)$.
6. Let $l$ be a line passing through two points $w_1, w_2$. Prove that the reflection
through that line has the form

$$\rho(z) = \frac{w_1 - w_2}{\bar{w}_1 - \bar{w}_2}\,\bar{z} + \frac{\bar{w}_1 w_2 - w_1 \bar{w}_2}{\bar{w}_1 - \bar{w}_2}.$$

(Hint: You know that $\rho(z) = a\bar{z} + b$ for some $a, b \in \mathbb{C}$ and that $\rho(w_1) = w_1$,
$\rho(w_2) = w_2$. Use this to solve for $a$ and $b$.)
7. Let $l$ be a line and let $w$ be the closest point on $l$ to the origin. Our goal is to
prove that the reflection through $l$ has the form

$$\rho(z) = -\frac{w}{\bar{w}}\,\bar{z} + 2w.$$

a) Prove this assuming that $w > 0$.

b) Write $w = re^{i\theta}$ and $\varphi = e^{i\theta} z$. If $\rho$ is the reflection through $l$, describe what
sort of similarity $\varphi^{-1} \circ \rho \circ \varphi$ is.

c) Use your answers to the previous parts to prove the result for the general
case.


8. Let $l$ be a line passing through two points $w_1, w_2$. Prove that the glide reflection
through the line moving $w_1 \mapsto w_2$ has the form

$$z \mapsto \frac{w_1 - w_2}{\bar{w}_1 - \bar{w}_2}\,\bar{z} + \frac{|w_1|^2 - 2\bar{w}_1 w_2 + |w_2|^2}{\bar{w}_2 - \bar{w}_1}.$$

(Hint: Use the result of Exercise 1.2.6.)
9. Prove Theorem 1.9. (Hint: you only need to prove that any translation and any 
rotation around the origin are compositions of reflections.) 
10. Prove Theorem 1.11.

1.3 PROOFS (Calculus)


1. Our goal for this exercise is to give a semi-rigorous proof of Euler’s theorem.
(For a fully rigorous proof we would need to properly define what we mean by
$e^z$ for complex inputs. There are various ways to do this. One approach is to say
that $f(z) = e^z$ is the unique solution to the differential equation $y' = y$ with
initial condition $y(0) = 1$, but this requires defining the complex derivative.)


a) Compute the Taylor series of $e^x$ centered at $x = 0$.

b) Substituting $ix$ for $x$, compute the Taylor series of $e^{ix}$ centered at $x = 0$.

c) Compute the Taylor series of $\cos(x)$ and $\sin(x)$ centered at $x = 0$.

d) Using the results of the previous parts, show that the Taylor series of $e^{ix}$
matches the Taylor series of $\cos(x) + i\sin(x)$.

e) Use Taylor’s Remainder Theorem to prove that $e^x$, $\sin(x)$, $\cos(x)$ are all
equal to their Taylor series for all $x$. Conclude that $e^{ix} = \cos(x) + i\sin(x)$
for all $x$.


1.4 PROOFS (Group Theory) 


1. Let G = {even, odd} and define 


even + even = even, 
even + odd = odd, 

odd + even = odd, 

odd + odd = even. 


We will prove that G is a group. 


a) Prove that + is associative. (Hint: There are only finitely many choices for 
a,b,cina+(b+c) = (a+b) +c to check.) 

b) Prove that “even” is an identity. (Hint: There are only finitely many choices
for $a$ in $a + \text{even} = \text{even} + a = a$ to check.)

c) Prove that every element a has an inverse. 


2. Let $G$ be a group with an identity $\iota$. Suppose that there is another element $e \in G$
with the property that $g * e = e * g = g$ for all $g \in G$.


a) Why is $\iota * e = \iota$?
b) Why is $\iota * e = e$?
c) Why does this prove that the identity of a group is unique?


3. Let $G$ be a group with an identity $\iota$. Let $g$ be an element of $G$, and suppose that
there are two elements $h, k \in G$ such that $g * h = h * g = \iota$ and $g * k = k * g = \iota$.

a) Why is $k * g * h = k$?
b) Why is $k * g * h = h$?
c) Why does this prove that the inverse of any element is unique?


4. Let $(G, *, \iota)$ be a group. Prove that $H$ is a subgroup if and only if $(H, *, \iota)$ is a
group.

5. Prove Theorem 1.14. (Hint: you may want to go through Exercise 1.4.4 first.)

6. Let $G, H$ be groups with identities $\iota_G$ and $\iota_H$, and operations $*$ and $\circ$. An
isomorphism between $G$ and $H$ is a bijective map $\varphi : G \to H$ such that for
all $a, b \in G$, $\varphi(a * b) = \varphi(a) \circ \varphi(b)$. Intuitively, we say that an isomorphism
“preserves multiplication.” It can also be thought about as a map that renames
elements, but keeps the underlying arithmetic the same.


a) Prove that $\varphi(\iota_G) = \iota_H$. (Hint: use the result of Exercise 1.4.2.)
b) Prove that $\varphi(a^{-1}) = \varphi(a)^{-1}$. (Hint: use the result of Exercise 1.4.3.)
c) Prove that $\varphi^{-1}$ is also a group isomorphism.


7. Prove that the natural logarithm $\ln$ is an isomorphism between $\mathbb{R}^+$ (considered
as a group under multiplication) and $\mathbb{R}$ (considered as a group under addition).
What is its inverse?

8. For some fixed symbol $g$, let $G$ be the set of all symbols $g^n$ where $n \in \mathbb{Z}$. Give
$G$ a multiplication by $g^m * g^n = g^{m+n}$.


a) Prove that $G$ is a group.
b) Prove that there is an isomorphism between $\mathbb{Z}$ and $G$.


9. For some fixed collection of symbols $g_1, g_2, \ldots, g_k$, define the free group
$\langle g_1, g_2, \ldots, g_k \rangle$ to be the set of all sequences written in terms of symbols
$g_1^{n_1}, \ldots, g_k^{n_k}$, where $n_1, \ldots, n_k \in \mathbb{Z}$ (e.g. $g_3^2 g_4 g_1^{-5}$ is an element of the free
group), with the rule that $g_i^m g_i^n = g_i^{m+n}$ for any $i$ in any such sequence (e.g.
$g_1 g_2^3 g_2^{-1} g_6 = g_1 g_2^2 g_6$). For convenience, rather than writing the empty sequence
consisting of no symbols as a blank space, we instead write it as $\iota$. Define a
multiplication on the free group via concatenation; that is, given any two sequences,
we can just write one after the other (e.g. $g_3 g_7 * g_1 g_7 = g_3 g_7 g_1 g_7$).


a) Prove that $\langle g_1, g_2, \ldots, g_k \rangle$ is a group.
b) Prove that there exist elements $x, y \in \langle g_1, g_2, \ldots, g_k \rangle$ such that $x * y \neq y * x$
as long as $k > 1$.


10. For some fixed collection of symbols $g_1, g_2, \ldots, g_k$, and some fixed
collection of elements $X_1, X_2, \ldots, X_l \in \langle g_1, \ldots, g_k \rangle$, define the quotient group
$\langle g_1, g_2, \ldots, g_k \mid X_1, X_2, \ldots, X_l \rangle$ to be the elements of $\langle g_1, \ldots, g_k \rangle$ with the
additional rules that $X_1 = X_2 = \ldots = X_l = \iota$. For example, in $\langle g, h \mid ghg^{-1}h^{-1} \rangle$,
$ghg^{-1}h^{-1} = \iota$, so $gh = hg$; that is, we can freely exchange the order of $g$ and
$h$ in this quotient group.

a) Prove that $\langle g_1, g_2, \ldots, g_k \mid X_1, X_2, \ldots, X_l \rangle$ is a group.
b) Prove that $\langle g \mid g^2 \rangle$ has just two elements in it. Does it remind you of any other
group you have seen? Can you find an isomorphism?


11. a) Let $D$ be any square in the Euclidean plane. Define Isom($D$) to be the
collection of isometries $\varphi$ with the property that $\varphi(D) = D$. Prove that
Isom($D$) is a group.

b) Let $D_1, D_2$ be any two squares in the Euclidean plane. Since they are similar,
there exists a similarity $\psi : \mathbb{C} \to \mathbb{C}$ such that $\psi(D_1) = D_2$. Prove that if
$\varphi \in \mathrm{Isom}(D_1)$, then $\psi \circ \varphi \circ \psi^{-1} \in \mathrm{Isom}(D_2)$.
c) Prove that

$$\Psi : \mathrm{Isom}(D_1) \to \mathrm{Isom}(D_2), \qquad \varphi \mapsto \psi \circ \varphi \circ \psi^{-1}$$

is an isomorphism.
d) Why can we now conclude that Isom($D$) has precisely 8 elements, regardless
of the choice of square $D$?
12. The result of Exercise 1.4.11 tells us that the underlying multiplication of
Isom($D$) does not really change regardless of which square $D$ is. All that
changes is how we write down the elements of the group. Our goal here is to
give a standard way to write down this group that makes this property explicit.


a) Define $R(z) = iz$ and $L(z) = \bar{z}$. Prove that $R^4 = \iota$, $L^2 = \iota$, and $(L \circ R)^2 = \iota$.
(Here, as for all groups, $a^n$ should be understood as $a$ multiplied by itself
$n$ times. Recall that in this case, multiplication means composition.)
b) Define $D_8 = \langle L, R \mid R^4, L^2, LRLR \rangle$ (see Exercise 1.4.10). Prove that

$$D_8 \to \mathrm{Isom}(◇), \qquad L \mapsto L, \quad R \mapsto R$$

defines an isomorphism.


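13. Define a set $M$ consisting of all $3 \times 3$ real matrices of the form

$$\begin{pmatrix} a & -b & x \\ b & a & y \\ 0 & 0 & 1 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} a & b & x \\ b & -a & y \\ 0 & 0 & 1 \end{pmatrix}$$

for some $a, b, x, y$ such that $a^2 + b^2 = 1$.


a) Prove that $M$ is a group if we take the operation to be matrix multiplication.
b) Prove that

$$\mathrm{Isom}(\mathbb{C}) \to M,$$
$$az + b \mapsto \begin{pmatrix} \Re(a) & -\Im(a) & \Re(b) \\ \Im(a) & \Re(a) & \Im(b) \\ 0 & 0 & 1 \end{pmatrix},$$
$$a\bar{z} + b \mapsto \begin{pmatrix} \Re(a) & \Im(a) & \Re(b) \\ \Im(a) & -\Re(a) & \Im(b) \\ 0 & 0 & 1 \end{pmatrix}$$

is a group isomorphism.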




Inversive Geometry 


In which the true faces of our main 
characters are revealed, and we 
consider their actions. 


Having understood the simplest linear fractional transformations, our next goal
should be to understand maps

$$g(z) = \frac{az + b}{cz + d}$$

as functions on the complex plane. Of course, this statement isn’t quite right: if
$c \neq 0$, such maps cannot possibly be functions on the complex plane. Indeed, since
$c(-d/c) + d = 0$, $g(-d/c)$ can’t be defined in the usual way. This is a problem
that can be fixed, but it will require introducing a point at infinity. This sentence also
contains a more subtle inaccuracy: we won’t actually consider all functions of this
form, because some of them are very boring. For instance, suppose that $d \neq 0$ and
$a = bc/d$. Then

$$g(z) = \frac{az + b}{cz + d} = \frac{\frac{bc}{d} z + b}{cz + d} = \frac{b(cz + d)}{d(cz + d)} = \frac{b}{d}$$

as long as $cz + d \neq 0$. Constant functions are not interesting and so we will exclude
them. What we shall discover is that as long as we require that $ad - bc \neq 0$, $g$ can
be defined and it will be a non-constant function on the complex plane augmented
with a point at infinity.


© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023


Fig. 2.1 On the left, the graph of $x \mapsto \frac{5x + \ldots}{2x + 1}$ plotted over the real numbers. On the right, a small
piece of the graph of $z \mapsto \left| \frac{z - 5i}{\ldots} \right|^2$ over the complex numbers.


2.1 The Extended Euclidean Plane 


Before we begin discussing what a point at infinity is, let’s first think about what we
would like to define $g(-d/c)$ to be. It is useful to consider what $g(z)$ tends toward
as $z$ approaches $-d/c$. For illustrative purposes, let us first consider the case where
$a, b, c, d$ are all real numbers and we look at what happens as $x$ approaches $-d/c$
from the left and from the right. From basic calculus, we know that if $c \neq 0$ and
$a(-d/c) + b \neq 0$ then

$$\lim_{x \to -d/c^+} \frac{ax + b}{cx + d} = \pm\infty, \qquad \lim_{x \to -d/c^-} \frac{ax + b}{cx + d} = \mp\infty,$$

where the sign of the limit depends on the signs of $c$ and $a(-d/c) + b$. Furthermore,
the signs of the two limits are always opposite to one another. In either case, as $x$
gets very close to $-d/c$, $|(ax + b)/(cx + d)|$ will get larger and larger without
bound. These asymptotes are shown in Figure 2.1.

This observation suggests that we may want to define $g(-d/c) = \infty$ or $-\infty$.
Except, which one should it be? After all, which one it is depends on the direction
we approach from. This gets even more complicated when we generalize to the
case where $a, b, c, d$ might be complex numbers and we are looking at approaching
from any conceivable direction in the complex plane. Thankfully, there are no such
ambiguities if we consider the norm: if $a, b, c, d$ are complex numbers, $c \neq 0$, and
$a(-d/c) + b \neq 0$, then

$$\lim_{z \to -d/c} \left| \frac{az + b}{cz + d} \right| = \infty.$$

(See Exercise 2.3.1.) This suggests a simple solution to our problem: augment the
complex plane with a single point, which we call the point at infinity.
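
The claim that the norm blows up regardless of the direction of approach can be illustrated numerically. This is only an illustration, not a proof; the coefficients below are hypothetical values chosen so that $c \neq 0$ and $a(-d/c) + b \neq 0$, and the helper name `norms_along` is introduced here.

```python
import cmath

# Hypothetical coefficients with c != 0 and a*(-d/c) + b != 0.
a, b, c, d = 1 + 2j, 3 + 0j, 1j, 2 - 1j
pole = -d / c


def g(z: complex) -> complex:
    return (a * z + b) / (c * z + d)


def norms_along(theta: float, radii=(1e-2, 1e-4, 1e-6)):
    """|g| at points approaching the pole along the direction e^{i*theta}."""
    return [abs(g(pole + r * cmath.exp(1j * theta))) for r in radii]


# Eight equally spaced directions of approach.
directions = [k * cmath.pi / 4 for k in range(8)]
table = {theta: norms_along(theta) for theta in directions}
```

In every direction the listed norms increase as the radius shrinks, growing roughly like a constant divided by the distance to the pole, whereas the values of $g$ themselves point in different directions depending on the angle of approach.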


Definition 2.1 The extended complex plane, also known as the Riemann sphere, also
known as the complex projective line, is defined as

$$\mathbb{CP}^1 = \mathbb{C} \cup \{\infty\},$$

where $\infty$ is an extra formal symbol.


Remark 2.1 This definition is not the only possible way to add points at infinity to $\mathbb{C}$
or, equivalently, $\mathbb{R}^2$. In fact, it isn’t even the only widely used construction: another
very common one is $\mathbb{RP}^2$, the real projective plane, which adds an entire line at
infinity. While this construction is important, we will not make use of it.


The term “extended complex plane” is unlikely to be surprising, but the terms 
“Riemann sphere” and “complex projective line” probably are—after all, it certainly 
doesn’t look like there is a sphere or a line here. How spheres show up will become 
apparent momentarily; the reason why this space is a “line” in any sense comes from 
projective geometry and requires some familiarity with abstract algebra. I refer the 
interested reader to Hartshorne’s Foundations of Projective Geometry [6]. 

In some sense, this definition is incomplete: it only describes what $\mathbb{CP}^1$ is like
as a set. It does not give any indication about what other structure $\mathbb{CP}^1$ might have.
Is it a group? Does it have some sort of distance function defined on it? The short 
answer is that neither of those two structures applies (at least, not in natural ways). 
It does have natural structure as a topological space, a manifold, and an algebraic 
variety, but, unfortunately, all of those are beyond the scope of this book. Again, I 
refer the reader to Hartshorne [6]. 

While we define $\mathbb{C}P^1$ in an altogether formal way (it is the set of complex numbers with a single new point added), the intuition about what that point represents is fairly clear: it is supposed to be a point that is infinitely far away from the origin. There are various ways to make this precise, but we will only note that this construction can be understood in terms of stereographic projection.


Definition 2.2 Let $S^2$ be the unit sphere in $\mathbb{R}^3$ centered at the origin. Define the north pole $p_N = (0, 0, 1)$. For every point $p \in S^2$ that is not the north pole, there is a unique line $l(p)$ through $p$ and $p_N$, and this line has a unique intersection $Q(p)$ with the $xy$-plane. This point has a natural interpretation as a complex number, which we call $S(p)$, the stereographic projection of $p$ onto $\mathbb{C}$. We call the map

\[
\begin{aligned}
\mathrm{stereo} : S^2 \setminus \{p_N\} &\to \mathbb{C} \\
p &\mapsto S(p)
\end{aligned}
\]

stereographic projection.


Figure 2.2 gives an example of how this map works; Figure 2.3 shows a slice of the same. It is geometrically clear that the closer that the point $p$ gets to $p_N$, the further away $\mathrm{stereo}(p)$ will be from the origin. This suggests an obvious extension of the map stereo, as follows:

\[
\begin{aligned}
\mathrm{stereo} : S^2 &\to \mathbb{C}P^1 \\
p &\mapsto \begin{cases} S(p) & \text{if } p \neq p_N \\ \infty & \text{otherwise.} \end{cases}
\end{aligned}
\]

By slight abuse of notation, we shall also call this map stereographic projection. It has many nice properties, but what we shall care about primarily is that it is bijective and can be given a wonderfully simple algebraic description.

Fig. 2.2 The stereographic projection of a curve on the sphere to a curve on the complex plane.


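Theorem 2.1 The map stereo is bijective. Indeed,

\[
\begin{aligned}
\mathrm{stereo} : S^2 &\to \mathbb{C}P^1 \\
(x, y, z) &\mapsto \begin{cases} \dfrac{x + iy}{1 - z} & \text{if } z \neq 1 \\ \infty & \text{if } z = 1, \end{cases}
\end{aligned}
\]

and its inverse is

\[
\begin{aligned}
\mathrm{stereo}^{-1} : \mathbb{C}P^1 &\to S^2 \\
w &\mapsto \begin{cases} \left( \dfrac{2\Re(w)}{|w|^2 + 1}, \dfrac{2\Im(w)}{|w|^2 + 1}, \dfrac{|w|^2 - 1}{|w|^2 + 1} \right) & \text{if } w \neq \infty \\ (0, 0, 1) & \text{if } w = \infty. \end{cases}
\end{aligned}
\]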


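Proof Choose a point $(x, y, z)$ on the sphere. If $z = 1$, then this point must be $(0, 0, 1)$, which we know stereographic projection sends to $\infty$. Otherwise, we can determine what point in $\mathbb{C}$ it will be sent to as follows: the line through $p_N = (0, 0, 1)$ and $(x, y, z)$ is the set of points of the form

\[
(0, 0, 1)(1 - t) + (x, y, z)t = (xt, yt, (z - 1)t + 1)
\]

for $t \in \mathbb{R}$, and the intersection of this line with the $xy$-plane happens precisely when the $z$-component is zero, that is, when $(z - 1)t + 1 = 0$, or $t = 1/(1 - z)$. Therefore, $S((x, y, z)) = \frac{x + iy}{1 - z}$, as claimed. Next, we check that the inverse is as claimed. For any $(x, y, z) \in S^2 \setminus \{p_N\}$, let

\[
w = \frac{x + iy}{1 - z}
\]

and note that

\[
\begin{aligned}
\mathrm{stereo}^{-1}(w) &= \left( \frac{2\Re(w)}{1 + |w|^2}, \frac{2\Im(w)}{1 + |w|^2}, \frac{|w|^2 - 1}{|w|^2 + 1} \right) \\
&= \left( \frac{\frac{2x}{1 - z}}{1 + \frac{x^2 + y^2}{(1 - z)^2}}, \frac{\frac{2y}{1 - z}}{1 + \frac{x^2 + y^2}{(1 - z)^2}}, \frac{\frac{x^2 + y^2}{(1 - z)^2} - 1}{\frac{x^2 + y^2}{(1 - z)^2} + 1} \right) \\
&= \left( \frac{2x(1 - z)}{(1 - z)^2 + x^2 + y^2}, \frac{2y(1 - z)}{(1 - z)^2 + x^2 + y^2}, \frac{x^2 + y^2 - (1 - z)^2}{x^2 + y^2 + (1 - z)^2} \right) \\
&= \left( \frac{2x(1 - z)}{x^2 + y^2 + z^2 + 1 - 2z}, \frac{2y(1 - z)}{x^2 + y^2 + z^2 + 1 - 2z}, \frac{x^2 + y^2 - 1 + 2z - z^2}{x^2 + y^2 + z^2 + 1 - 2z} \right) \\
&= \left( \frac{2x(1 - z)}{2(1 - z)}, \frac{2y(1 - z)}{2(1 - z)}, \frac{2z - 2z^2}{2 - 2z} \right) \\
&= (x, y, z),
\end{aligned}
\]

where the second-to-last equality uses $x^2 + y^2 + z^2 = 1$. Since the north pole is sent to $\infty$ and back, we conclude that $(\mathrm{stereo}^{-1} \circ \mathrm{stereo})(p) = p$ for all $p \in S^2$. I leave proving that composing in the other order also gives the identity as an exercise for the reader. (See the exercises.) □

Fig. 2.3 Stereographic projection restricted to the $y = 0$ plane.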
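As a sanity check on Theorem 2.1, here is a minimal Python sketch of both formulas. The names `stereo`, `stereo_inv`, and the use of `float("inf")` as a stand-in for the point at infinity are assumptions of this sketch, not notation from the text.

```python
INF = float("inf")  # assumed stand-in for the point at infinity

def stereo(p):
    """Send a point (x, y, z) of the unit sphere to the extended complex plane."""
    x, y, z = p
    if z == 1:
        return INF  # the north pole goes to the point at infinity
    return complex(x, y) / (1 - z)

def stereo_inv(w):
    """Send a point of the extended complex plane back to the unit sphere."""
    if w == INF:
        return (0.0, 0.0, 1.0)
    n = abs(w) ** 2
    return (2 * w.real / (n + 1), 2 * w.imag / (n + 1), (n - 1) / (n + 1))

p = (2 / 3, 2 / 3, 1 / 3)  # a point on the unit sphere
w = stereo(p)              # close to 1 + 1j
q = stereo_inv(w)          # close to p again, up to rounding
```

Floating-point arithmetic only reproduces the round trip approximately, so comparisons should use a small tolerance.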


It now makes perfect sense to define $\varphi(z) = (az + b)/(cz + d)$ as a function that returns points in the extended complex plane $\mathbb{C}P^1$. However, we would like the domain to be the same as the codomain so that $\varphi$ is really a transformation of a particular space. We can accomplish this without too much difficulty. All we need to do is to define $\varphi(\infty)$, and once again calculus comes to the rescue: if $ad - bc \neq 0$ and $c \neq 0$, then

\[
\lim_{z \to \infty} \frac{az + b}{cz + d} = \frac{a}{c}.
\]

(See Exercise 2.3.2.) In words, what we are saying is that as $z$ gets farther and farther away from the origin (irrespective of which direction), $(az + b)/(cz + d)$ gets closer and closer to $a/c$. This makes perfect intuitive sense: if $|z|$ is large, then $az + b \approx az$ and $cz + d \approx cz$, and so $(az + b)/(cz + d) \approx (az)/(cz) = a/c$. This, finally, allows us to make an unambiguous definition of linear fractional transformations.


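Definition 2.3 A linear fractional transformation on $\mathbb{C}P^1$ is a map of the form

\[
\begin{aligned}
\varphi : \mathbb{C}P^1 &\to \mathbb{C}P^1 \\
z &\mapsto \begin{cases}
\dfrac{a}{c} & \text{if } z = \infty,\ c \neq 0 \\
\infty & \text{if } z = \infty,\ c = 0 \text{ or if } z = -\dfrac{d}{c},\ c \neq 0 \\
\dfrac{az + b}{cz + d} & \text{otherwise,}
\end{cases}
\end{aligned}
\]

for some $a, b, c, d \in \mathbb{C}$ such that $ad - bc \neq 0$.


Remark 2.2 If $ad - bc = 0$, then either $\varphi$ will be undefined, or it will be a constant function. (See Exercise 2.2.1.)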


One possible objection to this definition is that it is not very elegant: it requires 
splitting into four different cases. There are various ways to rectify this. One possible 
solution is to write 


\[
z \mapsto \lim_{w \to z} \frac{aw + b}{cw + d},
\]

with the understanding that a limit that fails to exist should be understood as returning $\infty$. Another option is to phrase everything in terms of projective geometry, but we will not pursue this notion in this text.


> Example Let $\varphi(z) = (3 + z)/(z - 1)$. Find $\varphi(\infty)$ and find the $z \in \mathbb{C}P^1$ such that $\varphi(z) = \infty$. How many such $z$ are there? Would the answer be different for a different linear fractional transformation $\varphi'$?

Since

\[
\lim_{z \to \infty} \frac{3 + z}{z - 1} = 1,
\]

we see that $\varphi(\infty) = 1$. On the other hand, per the definition of the linear fractional transformation, $\varphi(z) = \infty$ only if the denominator is 0, that is, if $z - 1 = 0$. Therefore, $z = 1$ is the unique element such that $\varphi(z) = \infty$. It is easy enough to see that this same result will hold true for any linear fractional transformation: either $c = 0$, and $\varphi(z) = \infty$ if and only if $z = \infty$, or $c \neq 0$, and $\varphi(z) = \infty$ if and only if $cz + d = 0$. In either case, there is always exactly one point for which this happens.
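The case analysis in the definition of a linear fractional transformation translates almost line by line into code. The following is a rough sketch only: the helper name `lft` and the choice of `float("inf")` to represent the point at infinity are assumptions made here for illustration.

```python
INF = float("inf")  # assumed sentinel for the point at infinity

def lft(a, b, c, d):
    """Return z -> (az + b)/(cz + d) as a map on the extended complex plane."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be non-zero")

    def phi(z):
        if z == INF:
            # a/c when c != 0, and the point at infinity when c == 0
            return a / c if c != 0 else INF
        if c * z + d == 0:
            # only possible when c != 0, i.e. at z = -d/c
            return INF
        return (a * z + b) / (c * z + d)

    return phi

phi = lft(1, 3, 1, -1)  # the example (3 + z)/(z - 1)
```

With this sketch, `phi(INF)` evaluates to `1.0` and `phi(1)` evaluates to `INF`, matching the worked example.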
2.2 A Little Bit More Group Theory

Linear fractional transformations have many beautiful properties. The first among these that we will prove is that they form a group. To be a bit more precise, it is convenient to give a definition.


Definition 2.4 The set of all linear fractional transformations $\varphi : \mathbb{C}P^1 \to \mathbb{C}P^1$ shall be denoted $\text{Möb}^0(2)$.


Remark 2.3 The notation $\text{Möb}^0(2)$ to denote linear fractional transformations is liable to raise some eyebrows. I promise that there is a good reason for it, which will be explained later in this chapter.


What we are going to prove is that $\text{Möb}^0(2)$, together with function composition $\circ$ as the operation, is a group. First, we will need a few lemmas.


Lemma 2.1 Let $\varphi_1, \varphi_2 \in \text{Möb}^0(2)$. Then $\varphi_1 \circ \varphi_2 \in \text{Möb}^0(2)$.


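Proof This is a simple algebra exercise. Write

\[
\varphi_1(z) = \lim_{w \to z} \frac{aw + b}{cw + d}
\]

and

\[
\varphi_2(z) = \lim_{w \to z} \frac{a'w + b'}{c'w + d'}.
\]

Then

\[
\begin{aligned}
(\varphi_1 \circ \varphi_2)(z) &= \lim_{w \to z} \frac{a\varphi_2(w) + b}{c\varphi_2(w) + d} \\
&= \lim_{w \to z} \frac{a \lim_{w' \to w} \frac{a'w' + b'}{c'w' + d'} + b}{c \lim_{w' \to w} \frac{a'w' + b'}{c'w' + d'} + d}.
\end{aligned}
\]

However, all of our functions are continuous, so we may pull out the limit as $w' \to w$. This allows us to simplify to

\[
\begin{aligned}
(\varphi_1 \circ \varphi_2)(z) &= \lim_{w \to z} \lim_{w' \to w} \frac{a \frac{a'w' + b'}{c'w' + d'} + b}{c \frac{a'w' + b'}{c'w' + d'} + d} \\
&= \lim_{w \to z} \frac{a(a'w + b') + b(c'w + d')}{c(a'w + b') + d(c'w + d')} \\
&= \lim_{w \to z} \frac{(aa' + bc')w + (ab' + bd')}{(a'c + c'd)w + (b'c + dd')}.
\end{aligned}
\]

We see that the expression at the end is of the right form once we check that

\[
\begin{aligned}
(aa' + bc')(b'c + dd') - (ab' + bd')(a'c + c'd) &= aa'b'c + aa'dd' + bb'cc' + bc'dd' \\
&\quad - aa'b'c - ab'c'd - a'bcd' - bc'dd' \\
&= aa'dd' + bb'cc' - ab'c'd - a'bcd' \\
&= (ad - bc)(a'd' - b'c') \neq 0,
\end{aligned}
\]

whence it is a linear fractional transformation. □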


The trick of rearranging the limit whenever we encounter compositions of linear 
fractional transformations will always work, but I maintain that it is something that 
only needs to be seen once. As such, henceforth, we shall leave off writing the limit 
entirely. Instead, by slight abuse of notation, we will simply write 

\[
\begin{aligned}
\varphi : \mathbb{C}P^1 &\to \mathbb{C}P^1 \\
z &\mapsto \frac{az + b}{cz + d}
\end{aligned}
\]

without worrying about the exceptional points.


Lemma 2.2 For any $\varphi \in \text{Möb}^0(2)$, there exists $\varphi^{-1} \in \text{Möb}^0(2)$ with the property that $\varphi(\varphi^{-1}(z)) = \varphi^{-1}(\varphi(z)) = z$ for all $z \in \mathbb{C}P^1$.


Proof Write

\[
\varphi(z) = \frac{az + b}{cz + d}
\]

and define

\[
\varphi^{-1}(w) = \frac{dw - b}{-cw + a}.
\]

It is clear that $\varphi^{-1}$ is a linear fractional transformation since $da - (-b)(-c) = ad - bc \neq 0$. It remains to show that the correct composition law holds. I leave this as an exercise to the reader. (See Exercise 2.2.2.) □


While there is nothing particularly difficult about the proof of Lemma 2.2, where
the function $\varphi^{-1}$ came from is likely a little mysterious. In principle, we could have
deduced it from the composition law we derived in Lemma 2.1. However, this would 
be messy. A more elegant description will present itself once we connect linear 
fractional transformations and matrices. 


Theorem 2.2 The set of linear fractional transformations $\text{Möb}^0(2)$ is a group if we take the group operation to be composition and the identity to be $1(z) = (1 \cdot z + 0)/(0 \cdot z + 1) = z$.


Proof The group operation is closed by Lemma 2.1. It is associative since function composition is always associative. It is clear that $1(z) = z$ is a linear fractional transformation and that composing with it changes nothing, hence it is the identity. Inverses exist by Lemma 2.2. □
This result helps shed light on why linear fractional transformations should be
important—they form a group of transformations on a space, and so if we study what 
type of properties such transformations preserve, we will be studying a new kind of 
geometry. While this group might seem unfamiliar, I claim that it is intimately tied 
to another common group. 


Definition 2.5 The general linear group on $\mathbb{C}^2$, denoted by $GL(2, \mathbb{C})$, consists of all $2 \times 2$ matrices with complex coefficients and non-zero determinant. That is, if

\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, \mathbb{C}),
\]

then $\det M = ad - bc \neq 0$.


Remark 2.4 The term “linear” comes from “linear transformation.” This group is just the collection of linear transformations from $\mathbb{C}^2$ to $\mathbb{C}^2$ that are invertible, represented as matrices.


Theorem 2.3 If we take the operation to be matrix multiplication, then $GL(2, \mathbb{C})$ is a group.


Proof Write

\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad M' = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix},
\]

and note that

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} = \begin{pmatrix} aa' + bc' & ab' + bd' \\ ca' + dc' & cb' + dd' \end{pmatrix},
\]

and since

\[
\begin{aligned}
(aa' + bc')(cb' + dd') - (ab' + bd')(ca' + dc') &= aa'b'c + aa'dd' + bb'cc' + bc'dd' \\
&\quad - aa'b'c - ab'c'd - a'bcd' - bc'dd' \\
&= aa'dd' + bb'cc' - ab'c'd - a'bcd' \\
&= (ad - bc)(a'd' - b'c') \neq 0,
\end{aligned}
\]

we see that $MM' \in GL(2, \mathbb{C})$. It is easy to check from the above that if we take

\[
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\]

then it will satisfy the properties of an identity. The existence of inverses is similarly easy to check:

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = (ad - bc) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\]

hence

\[
\frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
\]

is an inverse of our matrix. The only thing remaining is to prove that matrix multiplication is associative. This is a standard linear algebra exercise. (One way to see it is that multiplication of matrices corresponds to composition of linear transformations. Since composition of linear transformations is just function composition, it is associative.) □


Now, there is something deeply surprising here, which the careful reader might have picked up on: the calculations that we used to compute the product of matrices look oddly similar to the calculations that we used to compute the composition of linear fractional transformations! In fact, we have unwittingly proved a very interesting result.


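Theorem 2.4 There exists a surjective map

\[
\begin{aligned}
\Psi : GL(2, \mathbb{C}) &\to \text{Möb}^0(2) \\
\begin{pmatrix} a & b \\ c & d \end{pmatrix} &\mapsto \left( z \mapsto \frac{az + b}{cz + d} \right)
\end{aligned}
\]

with the property that $\Psi(M_1 M_2) = \Psi(M_1) \circ \Psi(M_2)$ for all $M_1, M_2 \in GL(2, \mathbb{C})$.


Proof I leave this as an exercise to the reader. (See Exercise 2.2.4.) □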


This correspondence gives a very handy computational tool. Rather than being forced to think about function composition to work out what linear fractional transformation $\varphi_1 \circ \varphi_2$ gives, we can instead pass from $\varphi_1$ and $\varphi_2$ to their corresponding matrices, multiply those, and voila! The result gives the coefficients of $\varphi_1 \circ \varphi_2$. For example, if we take

\[
\varphi_1(z) = \frac{2z + i}{iz - 1}, \qquad \varphi_2(z) = \frac{iz - 1 - i}{z},
\]

then we can multiply the corresponding matrices and get the composition

\[
\begin{pmatrix} 2 & i \\ i & -1 \end{pmatrix} \begin{pmatrix} i & -1 - i \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 3i & -2 - 2i \\ -2 & 1 - i \end{pmatrix},
\]

hence

\[
(\varphi_1 \circ \varphi_2)(z) = \frac{3iz - 2 - 2i}{-2z + 1 - i}.
\]

However, we have to be a little careful. Certainly, every single linear fractional transformation can be represented by a matrix in $GL(2, \mathbb{C})$. However, this representation is not unique, since for any $\lambda \in \mathbb{C}^\times$,

\[
\frac{az + b}{cz + d} = \frac{\lambda az + \lambda b}{\lambda cz + \lambda d},
\]

hence

\[
\Psi\left( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right) = \Psi\left( \begin{pmatrix} \lambda a & \lambda b \\ \lambda c & \lambda d \end{pmatrix} \right).
\]

Thankfully, there are standard ways to fix this issue. The intuitive idea is that you create a new group from $GL(2, \mathbb{C})$, but one in which matrices that differ by multiplication by a non-zero constant are treated as being one and the same. This new group is called $PGL(2, \mathbb{C})$, and in a certain precise sense, it is exactly the same as $\text{Möb}^0(2)$! (See Exercise 2.4.6.)

In any case, this map and its computational aid merit additional discussion. Why? Because it has greater repercussions than just for this one particular group; it is a much more broadly applicable phenomenon.
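The matrix shortcut is easy to try out numerically. The sketch below assumes matrices are stored as nested tuples `((a, b), (c, d))`; the helper names `matmul2` and `apply` are hypothetical and chosen only for this illustration.

```python
def matmul2(m, n):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def apply(m, z):
    """Evaluate z -> (az + b)/(cz + d) at a finite z with cz + d != 0."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

M1 = ((2, 1j), (1j, -1))      # phi_1(z) = (2z + i)/(iz - 1)
M2 = ((1j, -1 - 1j), (1, 0))  # phi_2(z) = (iz - 1 - i)/z
M12 = matmul2(M1, M2)         # coefficients of phi_1 composed with phi_2

z = 0.5 + 0.25j
composed = apply(M1, apply(M2, z))  # evaluate the composition directly
via_matrix = apply(M12, z)          # evaluate it through the product matrix
scaled = tuple(tuple(3j * entry for entry in row) for row in M12)
via_scaled = apply(scaled, z)       # a scalar multiple gives the same map
```

The three values `composed`, `via_matrix`, and `via_scaled` agree up to rounding, which illustrates both the homomorphism property and the non-uniqueness of the matrix representative.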


Definition 2.6 Let $(G, *, 1_G)$, $(H, \circ, 1_H)$ be two groups. A function $f : G \to H$ is called a group homomorphism if for all $g_1, g_2 \in G$, $f(g_1 * g_2) = f(g_1) \circ f(g_2)$. It is called a group isomorphism if additionally $f$ is bijective. If there exists a group isomorphism between $G$ and $H$, we say that they are isomorphic.


It is easy to show that a group homomorphism is a group isomorphism if and only if it has an inverse and that inverse is also a group homomorphism. (See Exercise 2.4.2.) We have just demonstrated an example of a group homomorphism: namely, the map $\Psi : GL(2, \mathbb{C}) \to \text{Möb}^0(2)$. We also hinted at a group isomorphism: a map $PGL(2, \mathbb{C}) \to \text{Möb}^0(2)$. But there are many, many other examples. For instance,


1. $\mathbb{R}^+$ (considered as a group under multiplication) is isomorphic to $\mathbb{R}$ (considered as a group under addition) via the map $\ln : \mathbb{R}^+ \to \mathbb{R}$. (See Exercise 1.4.7.)

2. For any two groups $G, H$, the map $\varphi : G \to H$ defined by $\varphi(g) = 1_H$ is a group homomorphism.

3. If $H$ is a subgroup of $G$, then the inclusion map $H \to G$ defined simply by $h \mapsto h$ is a group homomorphism. (See Exercise 2.4.3.)
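The first example in the list above can be spot-checked numerically. This is only an illustration on two sample values, not a proof; the function name `f` is a placeholder for the logarithm map.

```python
import math

def f(x):
    """The logarithm, viewed as a map from positive reals to reals."""
    return math.log(x)

x, y = 2.5, 7.0
lhs = f(x * y)            # image of the product in the multiplicative group
rhs = f(x) + f(y)         # sum of the images in the additive group
round_trip = math.exp(f(x))  # exp undoes f, as expected of an isomorphism
```

Here `lhs` and `rhs` agree up to rounding, which is the homomorphism property $f(g_1 * g_2) = f(g_1) \circ f(g_2)$ for this pair of groups.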


We will see far more examples in later chapters, as well as in the exercises. The 
importance of group isomorphisms is perhaps a little easier to understand: two groups 
are isomorphic if and only if they are essentially “the same.” I mean this in the 
following sense: if $f : G \to H$ is a group isomorphism, then for any $a, b, c \in G$,
$ab = c$ if and only if $f(a)f(b) = f(c)$. This means that if we were to write out the
multiplication tables of both G and H, they would look the same—we would just be 
relabeling the various group elements via f. Thus, important properties of groups 
are preserved by group isomorphisms. For example, if G and H are isomorphic, then 
one is abelian if and only if the other is abelian. (See Exercise 2.4.7.) 

However, our example of $GL(2, \mathbb{C}) \to \text{Möb}^0(2)$ shows that group homomorphisms are important even if they are not isomorphisms. Yes, in general, group homomorphisms will not preserve all nice group properties like abelian-ness. However, if $f : G \to H$ is a group homomorphism, you can still learn something about
the multiplication table of $H$ from the multiplication table of $G$, and vice versa. This
leads us to the following general principle. 
Philosophical Principle


If you wish to study a mathematical object (such as the Euclidean plane or groups), don't just study it in isolation. Instead, identify the transformations between objects of this kind that preserve something important about them (such as how isometries preserve distance, or how group homomorphisms preserve multiplication).

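> Example Let $\varphi(z) = \frac{z + 2}{iz + 1 + i}$. Find the subset of linear fractional transformations $\psi$ such that $\varphi \circ \psi$ is a translation.

We want that $(\varphi \circ \psi)(z) = z + b$ for some $b \in \mathbb{C}$. Passing to the corresponding matrices, we have

\[
\begin{pmatrix} 1 & 2 \\ i & 1 + i \end{pmatrix} \psi = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix},
\]

or, taking an inverse,

\[
\begin{aligned}
\psi &= \begin{pmatrix} 1 & 2 \\ i & 1 + i \end{pmatrix}^{-1} \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} = \frac{1}{1 - i} \begin{pmatrix} 1 + i & -2 \\ -i & 1 \end{pmatrix} \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \\
&= \frac{1}{1 - i} \begin{pmatrix} 1 + i & -2 + b + ib \\ -i & 1 - ib \end{pmatrix}.
\end{aligned}
\]

Since we can freely multiply by scalars without changing the original linear fractional transformation, we can just ignore the factor of $(1 - i)^{-1}$ in front and conclude that

\[
\left\{ z \mapsto \frac{(1 + i)z + (1 + i)b - 2}{-iz + 1 - ib} \;\middle|\; b \in \mathbb{C} \right\}
\]

is the desired family of linear fractional transformations.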


2.3 Circle Inversions 


The fact that $\text{Möb}^0(2)$ is a group suggests that we might be able to use similar reasoning to our approach in studying $\mathrm{Sim}(\mathbb{C})$ in Chapter 1. Namely, we will first break it apart into simpler transformations. We will understand those simple transformations as thoroughly as we can, and use the fact that $\text{Möb}^0(2)$ is a group to show that various properties preserved by simple transformations are preserved by more complicated ones. What are these simple transformations? This is answered by the following theorem.

Fig. 2.4 An illustration of the decomposition in Theorem 2.5. (a) shows an initial configuration; (b) shows a translation of (a); (c) shows a rotation and scaling of (b); (d) shows the image of (c) under the map $z \mapsto 1/z$; (e) shows a translation of (d).


Theorem 2.5 (Decomposition Theorem for LFTs on $\mathbb{C}P^1$) Any element in $\text{Möb}^0(2)$ can be written as a composition of rotations, translations, dilations, and the map $z \mapsto 1/z$.


Remark 2.5 A concrete example of this decomposition is shown in Figure 2.4. 


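Proof Choose any $\varphi \in \text{Möb}^0(2)$, and write

\[
\varphi(z) = \frac{az + b}{cz + d}.
\]

This theorem is most convenient to prove in terms of $GL(2, \mathbb{C})$ via Theorem 2.4, so we shall actually consider the related matrix

\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.
\]

We shall prove that $M$ can be written as a product of matrices of the forms

\[
\begin{pmatrix} 1 & \beta \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} \alpha & 0 \\ 0 & \delta \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\]

as, by inspection, such matrices correspond to translations, rotations/dilations, and $z \mapsto 1/z$, respectively. We are going to have two different cases: either $c = 0$ or $c \neq 0$. If $c = 0$, then

\[
M = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} = \begin{pmatrix} 1 & \frac{b}{d} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix},
\]

which proves what we wanted. If $c \neq 0$, then we first note that

\[
\begin{pmatrix} 1 & -\frac{a}{c} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 0 & b - \frac{ad}{c} \\ c & d \end{pmatrix}
\]

and

\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & b - \frac{ad}{c} \\ c & d \end{pmatrix} = \begin{pmatrix} c & d \\ 0 & b - \frac{ad}{c} \end{pmatrix} = \begin{pmatrix} c & 0 \\ 0 & -\frac{ad - bc}{c} \end{pmatrix} \begin{pmatrix} 1 & \frac{d}{c} \\ 0 & 1 \end{pmatrix}.
\]

Thus,

\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & -\frac{a}{c} \\ 0 & 1 \end{pmatrix} M = \begin{pmatrix} c & 0 \\ 0 & -\frac{ad - bc}{c} \end{pmatrix} \begin{pmatrix} 1 & \frac{d}{c} \\ 0 & 1 \end{pmatrix},
\]

from which we get, by multiplying by inverse matrices,

\[
M = \begin{pmatrix} 1 & \frac{a}{c} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} c & 0 \\ 0 & -\frac{ad - bc}{c} \end{pmatrix} \begin{pmatrix} 1 & \frac{d}{c} \\ 0 & 1 \end{pmatrix}.
\]

Passing to the corresponding linear fractional transformations, we precisely get the desired result. □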

Remark 2.6 To students of linear algebra, our proof may seem vaguely familiar: it 
is essentially Gaussian row reduction with some minor alterations. 
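The proof of the decomposition theorem is constructive, so it can be turned into a short routine. The sketch below is one possible rendering under stated assumptions: matrices are nested tuples `((a, b), (c, d))`, and the names `decompose`, `matmul2`, and `product` are hypothetical helpers, not notation from the text.

```python
def matmul2(m, n):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def decompose(a, b, c, d):
    """Factor ((a, b), (c, d)) into translation, diagonal, and swap matrices."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be non-zero")
    if c == 0:
        return [((1, b / d), (0, 1)), ((a, 0), (0, d))]
    return [
        ((1, a / c), (0, 1)),                 # translation by a/c
        ((0, 1), (1, 0)),                     # the map z -> 1/z
        ((c, 0), (0, -(a * d - b * c) / c)),  # rotation and dilation
        ((1, d / c), (0, 1)),                 # translation by d/c
    ]

def product(factors):
    """Multiply a list of 2x2 matrices from left to right."""
    result = ((1, 0), (0, 1))
    for factor in factors:
        result = matmul2(result, factor)
    return result

factors = decompose(2, 1j, 1j, -1)
M = product(factors)  # reproduces ((2, 1j), (1j, -1)) up to rounding
```

Reading the factor list from right to left gives the order in which the simple maps act on a point, since the rightmost matrix corresponds to the first transformation applied.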


We studied translations, rotations, and dilations extensively in Chapter 1, but the map $\varphi(z) = 1/z$ is new. What does it do to $\mathbb{C}P^1$? Well, there are two points where its action is completely obvious:

\[
\varphi(0) = \infty, \qquad \varphi(\infty) = 0.
\]

That is, the map $z \mapsto 1/z$ interchanges 0 and the point at infinity. What about every other point? Any other $z \in \mathbb{C}P^1$ we can write as $z = re^{i\theta}$ for some $r > 0$, and then we check that

\[
\varphi(re^{i\theta}) = \frac{1}{r} e^{-i\theta}.
\]

Therefore, we see that $z \mapsto 1/z$ does two things: first, it moves points that are distance $r$ away from the origin to points that are distance $1/r$ away from the origin; second, it does a reflection around the real axis. This makes it deeply tied to circle inversions.


Definition 2.7 Let $C$ be a circle in $\mathbb{C}$ with center $z_0$ and radius $R$. A reflection through $C$, also known as a circle inversion, is a transformation $\Phi$ on $\mathbb{C}P^1$ defined as follows: $\Phi(z_0) = \infty$, $\Phi(\infty) = z_0$, and for every other point $z \in \mathbb{C}P^1$, take the ray from $z_0$ to $z$, measure the distance $r$ between them, and send $z$ to the point along that same ray that is distance $R^2/r$ away from $z_0$. (See Figure 2.5 for an illustration.)

Fig. 2.5 On the left, an illustration of the effect of a circle inversion on a single point: $P$ is distance $r$ from the center, and it is sent to $\Phi(P)$, which is $R^2/r$ from the center, where $R$ is the radius of the circle. On the right, the effect of an inversion through the unit circle. The loop in blue is the original; the loop in green is its image after the inversion.


It might not be clear why we refer to such a transformation as a “reflection” — 
we will show later that you can get reflections through lines as limits of reflections 
through circles in some sense. Indeed, reflections through circles and reflections 
through lines share various similarities. For example, just as reflections through 
lines fix a particular line and interchange the two areas on either side of it, reflections 
through circles do the same but with circles. Furthermore, they do this exchange in 
such a way that they are their own inverses. 


Theorem 2.6 Let $\Phi$ be a reflection through a circle $C$. Then for all $z \in C$, $\Phi(z) = z$, all points inside $C$ are sent to points outside $C$, and all points outside $C$ are sent to points inside $C$.


Remark 2.7 This effect can be seen in Figures 2.6 and 2.7.


Proof Let $C$ have center $z_0$ and radius $R$. Choose any point $z$ on $C$; by definition, $z$ is distance $R$ away from $z_0$. The inversion $\Phi$ will send $z$ to a point distance $R^2/R = R$ away from $z_0$ along the same ray, which is to say that $\Phi(z) = z$. If $z = z_0$, then $z$ is inside $C$, but $\Phi(z) = \infty$, which is outside $C$. If $z = \infty$, then $z$ is outside $C$, but $\Phi(z) = z_0$, which is inside $C$. For all other $z \in \mathbb{C}P^1$, let $r > 0$ be its distance away from $z_0$. If $r < R$, then $z$ is inside $C$, but $\Phi(z)$ will be distance $R^2/r > R$ away from $z_0$, hence outside $C$. If $r > R$, then $z$ is outside $C$, but $\Phi(z)$ will be distance $R^2/r < R$ away from $z_0$, hence inside $C$. □


Theorem 2.7 Let $\Phi$ be a reflection through a circle $C$. Then $\Phi$ is its own inverse.


Proof Let $C$ have center $z_0$ and radius $R$. Then

\[
\Phi(\Phi(z_0)) = \Phi(\infty) = z_0, \qquad \Phi(\Phi(\infty)) = \Phi(z_0) = \infty.
\]

For any other point $z$, $z$ is some distance $r > 0$ away from $z_0$, so $\Phi(z)$ is distance $R^2/r$ away from $z_0$ along the same ray. Ergo, $\Phi(\Phi(z))$ is distance $R^2/(R^2/r) = r$ away from $z_0$ along the same ray, which is to say that $\Phi(\Phi(z)) = z$. Thus, we see that $\Phi(\Phi(z)) = z$ for all $z \in \mathbb{C}P^1$ and we have proved our claim. □

Fig. 2.6 On the left is the original image. On the right is its image under the reflection through the blue circle.


The map z +> 1/z is not quite a circle reflection: it is a circle reflection composed 
with a line reflection. Well, technically this cannot possibly be true. After all, any 
line reflection is only defined on $\mathbb{C}$, whereas we need it to be defined on all of $\mathbb{C}P^1$.
There is an easy fix for this. 


Definition 2.8 A line reflection on $\mathbb{C}P^1$ is a transformation $\Phi : \mathbb{C}P^1 \to \mathbb{C}P^1$ defined as follows: restricted to $\mathbb{C}$, $\Phi$ is a reflection across some line $l$, and $\Phi(\infty) = \infty$. More generally, for any similarity $\Phi : \mathbb{C} \to \mathbb{C}$, we can extend it to a transformation $\mathbb{C}P^1 \to \mathbb{C}P^1$ by defining $\Phi(\infty) = \infty$.


Theorem 2.8 Define $\varphi(z) = 1/z$ as a function on $\mathbb{C}P^1$. Then $\varphi$ is a composition of inversion through the unit circle and a reflection across the real axis.


Proof Let $\Phi$ be a reflection through the unit circle, and $\varsigma(z) = \bar{z}$ be the reflection across the real axis. Then $\Phi(0) = \infty$, and $(\varsigma \circ \Phi)(0) = \infty = \varphi(0)$. Similarly, $\Phi(\infty) = 0$, hence $(\varsigma \circ \Phi)(\infty) = 0 = \varphi(\infty)$. For every other $z$, we can write it as $re^{i\theta}$ for some $r > 0$. By the definition of $\Phi$, we have

\[
\Phi(re^{i\theta}) = \frac{1}{r} e^{i\theta}, \qquad (\varsigma \circ \Phi)(re^{i\theta}) = \frac{1}{r} e^{-i\theta} = \varphi(re^{i\theta}).
\]

Thus, $\varsigma \circ \Phi = \varphi$. □


Conversely, any circle reflection can be expressed as a composition of a linear 
fractional transformation and a line reflection. 


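Theorem 2.9 Let $\Phi$ be a reflection through a circle $C$ with center $z_0$ and radius $R$. Then

\[
\Phi(z) = \frac{z_0 \bar{z} + R^2 - |z_0|^2}{\bar{z} - \bar{z}_0}
\]

for all $z \in \mathbb{C}P^1$.


Proof Any circle $C$ with center $z_0$ and radius $R$ is the image of the unit circle after a dilation $\varphi_1(z) = Rz$ and a translation $\varphi_2(z) = z + z_0$. If $\varphi_3(z) = 1/\bar{z}$, we claim that

\[
\varphi_2 \circ \varphi_1 \circ \varphi_3 \circ (\varphi_2 \circ \varphi_1)^{-1} = \Phi.
\]

Intuitively, the idea is that we first change coordinates so that the circle $C$ turns into the unit circle; then we do $\varphi_3$, which is a reflection through the unit circle; finally, we change coordinates back. This should be the same as reflecting through $C$. Now,

\[
\begin{aligned}
\left( \varphi_2 \circ \varphi_1 \circ \varphi_3 \circ (\varphi_2 \circ \varphi_1)^{-1} \right)(z) &= \left( \varphi_2 \circ \varphi_1 \circ \varphi_3 \circ \varphi_1^{-1} \circ \varphi_2^{-1} \right)(z) \\
&= \left( \varphi_2 \circ \varphi_1 \circ \varphi_3 \circ \varphi_1^{-1} \right)(z - z_0) \\
&= \left( \varphi_2 \circ \varphi_1 \circ \varphi_3 \right)\left( \frac{z - z_0}{R} \right) \\
&= \left( \varphi_2 \circ \varphi_1 \right)\left( \frac{R}{\bar{z} - \bar{z}_0} \right) \\
&= \varphi_2\left( \frac{R^2}{\bar{z} - \bar{z}_0} \right) = \frac{R^2}{\bar{z} - \bar{z}_0} + z_0 \\
&= \frac{z_0 \bar{z} + R^2 - |z_0|^2}{\bar{z} - \bar{z}_0},
\end{aligned}
\]

and from this it is easy to see that $z_0 \mapsto \infty$ and $\infty \mapsto z_0$, as expected. For all other $z \in \mathbb{C}P^1$, we can write $z = z_0 + re^{i\theta}$ for some $r > 0$ and it is an easy computation that

\[
\left( \varphi_2 \circ \varphi_1 \circ \varphi_3 \circ \varphi_1^{-1} \circ \varphi_2^{-1} \right)(z_0 + re^{i\theta}) = z_0 + \frac{R^2}{r} e^{i\theta},
\]

which is precisely what the action of $\Phi$ should do. We are done. □

Fig. 2.7 On the left is the best cat. On the right is his image under the reflection through the blue circle.

Fig. 2.8 On the left is the unit square grid. On the right is its image under the map $z \mapsto 1/z$.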


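> Example Let $\Phi_1, \Phi_2$ be the circle reflections through the circles centered at 0 with radii 1 and 2, respectively. What is $\Phi_2 \circ \Phi_1$?

First, we note that $(\Phi_2 \circ \Phi_1)(0) = \Phi_2(\infty) = 0$ and $(\Phi_2 \circ \Phi_1)(\infty) = \Phi_2(0) = \infty$. Any other point $z \in \mathbb{C}P^1$, we can write as $re^{i\theta}$, and we see that

\[
(\Phi_2 \circ \Phi_1)(re^{i\theta}) = \Phi_2\left( \frac{1}{r} e^{i\theta} \right) = \frac{4}{1/r} e^{i\theta} = 4re^{i\theta}.
\]

From this, we get that $\Phi_2(\Phi_1(z)) = 4z$, which is a dilation.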
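Circle inversions are simple to experiment with numerically. The sketch below builds them from the description "same ray, distance $R^2/r$", written as $z_0 + R^2/\overline{(z - z_0)}$; the factory name `invert` and the `float("inf")` sentinel for the point at infinity are assumptions of this illustration.

```python
INF = float("inf")  # assumed sentinel for the point at infinity

def invert(z0, R):
    """Return the reflection through the circle with centre z0 and radius R."""
    def Phi(z):
        if z == INF:
            return z0
        if z == z0:
            return INF
        # same ray from z0, at distance R^2 / |z - z0|
        return z0 + R ** 2 / (z - z0).conjugate()
    return Phi

Phi1 = invert(0, 1)
Phi2 = invert(0, 2)
z = 0.3 - 0.7j
composite = Phi2(Phi1(z))                  # close to 4 * z
twice = Phi1(Phi1(z))                      # close to z: an involution
on_circle = Phi2(2j)                       # points of the circle stay fixed
shifted = invert(1 + 1j, 3)
formula = ((1 + 1j) * z.conjugate() + 9 - 2) / (z.conjugate() - (1 - 1j))
```

The last line evaluates the closed form $(z_0 \bar{z} + R^2 - |z_0|^2)/(\bar{z} - \bar{z}_0)$ for $z_0 = 1 + i$, $R = 3$, which should match `shifted(z)` up to rounding.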


2.4 Generalized Circles 


From the various illustrations that we have given so far, we see that circle inversions are not isometries nor even similarities; correspondingly, neither is $z \mapsto 1/z$. Recall that we showed that one of the defining properties of similarities was that they sent lines to lines and circles to circles. However, examples such as in Figure 2.8 show that while $z \mapsto 1/z$ sometimes sends lines to lines, it does not always.

However, studying these illustrations carefully, one notices something surprising: while $z \mapsto 1/z$ doesn't always send lines to lines, it looks like when it doesn't, it sends them to circles! This is indeed true, but to prove it, it shall be convenient to introduce the concept of a generalized circle.

Fig. 2.9 A family of circles that seem to approach a line.


Definition 2.9 A (generalized) circle in $\mathbb{C}P^1$ is either a circle in $\mathbb{C}$ or a line in $\mathbb{C}$ union $\{\infty\}$. We often call lines circles through infinity.


This definition can be motivated in various ways. One possible way is to observe that as the radius of a circle increases, in some sense, it can start to approach a line. Such a statement is technically meaningless without specifying a mode of convergence, but the intuition is clear from illustrations such as Figure 2.9. One could also motivate this definition in terms of the Riemann sphere: it is possible to show that both lines and circles in the plane correspond to circles on the sphere via stereographic projection. In any case, in order to prove that $z \mapsto 1/z$ preserves generalized circles, we shall want an algebraic description of them that unifies the descriptions of circles and lines.


Theorem 2.10 (Algebraic Description of Generalized Circles) Generalized circles are precisely those curves in $\mathbb{C}P^1$ that are solutions to equations of the form $Az\bar{z} + Bz + \bar{B}\bar{z} + C = 0$, where $A, C \in \mathbb{R}$, $B \in \mathbb{C}$, and

\[
\det \begin{pmatrix} A & B \\ \bar{B} & C \end{pmatrix} = AC - B\bar{B} < 0.
\]


Remark 2.8 There is an obvious question: how do we define whether or not $\infty$ is a solution to such an equation? In general, questions like this involve appeals to projective geometry. For our purposes, we approach as follows: if we divide by $z\bar{z}$ on both sides, we get

\[
A + \frac{B}{\bar{z}} + \frac{\bar{B}}{z} + \frac{C}{z\bar{z}} = 0.
\]

In light of this, we will define $\infty$ as being a solution to this equation if

\[
\lim_{z \to \infty} \left( A + \frac{B}{\bar{z}} + \frac{\bar{B}}{z} + \frac{C}{z\bar{z}} \right) = 0.
\]

It isn't hard to see that this is true if and only if $A = 0$.


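Proof We know that any circle with radius $R$ and center $z_0$ is the set of solutions to $|z - z_0| = R$. But

\[
|z - z_0|^2 = (z - z_0)(\bar{z} - \bar{z}_0) = |z|^2 - \bar{z}_0 z - z_0 \bar{z} + |z_0|^2,
\]

so we see that any circle is a solution to

\[
|z|^2 - \bar{z}_0 z - z_0 \bar{z} + |z_0|^2 - R^2 = 0,
\]

which is of the desired form, since

\[
\det \begin{pmatrix} 1 & -\bar{z}_0 \\ -z_0 & |z_0|^2 - R^2 \end{pmatrix} = -R^2 < 0.
\]

Any line can be obtained as the set of solutions to $|z - z_0| = |z - z_1|$ for some $z_0 \neq z_1$. (See Exercise 2.2.5.) Equivalently,

\[
\begin{aligned}
|z - z_0|^2 &= |z - z_1|^2 \\
|z|^2 - \bar{z}_0 z - z_0 \bar{z} + |z_0|^2 &= |z|^2 - \bar{z}_1 z - z_1 \bar{z} + |z_1|^2 \\
(\bar{z}_1 - \bar{z}_0)z + (z_1 - z_0)\bar{z} + |z_0|^2 - |z_1|^2 &= 0,
\end{aligned}
\]

and as

\[
\det \begin{pmatrix} 0 & \bar{z}_1 - \bar{z}_0 \\ z_1 - z_0 & |z_0|^2 - |z_1|^2 \end{pmatrix} = -|z_1 - z_0|^2 < 0,
\]

we see that this is also an equation of the desired form. Conversely, for any equation

\[
Az\bar{z} + Bz + \bar{B}\bar{z} + C = 0,
\]

if $A \neq 0$ we can divide by it to get a new equation

\[
z\bar{z} + \frac{B}{A} z + \frac{\bar{B}}{A} \bar{z} + \frac{C}{A} = 0.
\]

This is the equation of a circle with center $z_0$ and radius $R$, where

\[
z_0 = -\frac{\bar{B}}{A}, \qquad R = \sqrt{\frac{B\bar{B}}{A^2} - \frac{C}{A}} = \frac{\sqrt{B\bar{B} - AC}}{|A|} > 0.
\]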


If $A = 0$, then it is the equation of a line $|z - z_0| = |z - z_1|$ where

\[
\begin{aligned}
z_1 - z_0 &= \bar{B} \\
|z_0|^2 - |z_1|^2 &= C.
\end{aligned}
\]

Any such equation has solutions: for example, since $A = 0$ and $AC - B\bar{B} < 0$ force $B \neq 0$, take

\[
z_0 = -\frac{|B|^2 + C}{2B}, \qquad z_1 = z_0 + \bar{B}.
\]

(See Exercise 2.2.5.) We have thereby shown that any equation of the given form corresponds to either a circle or a line. □
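The converse direction of the proof gives explicit formulas, so reading off a center and radius from the coefficients can be sketched in a few lines. The function name `describe` and its return convention are assumptions of this sketch.

```python
import math

def describe(A, B, C):
    """Classify the solution set of A z z̄ + B z + B̄ z̄ + C = 0 (A, C real)."""
    B = complex(B)
    b2 = (B * B.conjugate()).real  # B times its conjugate, a real number
    if A * C - b2 >= 0:
        raise ValueError("need AC - |B|^2 < 0")
    if A != 0:
        centre = -B.conjugate() / A
        radius = math.sqrt(b2 - A * C) / abs(A)
        return ("circle", centre, radius)
    # A == 0: the line B z + B̄ z̄ + C = 0, returned by its coefficients
    return ("line", B, C)

kind, centre, radius = describe(2, 3 + 1j, 3)
line = describe(0, 1, -1)  # z + z̄ - 1 = 0, i.e. the line Re(z) = 1/2
```

For the coefficients $A = 2$, $B = 3 + i$, $C = 3$ this returns a circle with center $-(3 - i)/2$ and radius 1.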


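We can now prove that $z \mapsto 1/z$ preserves generalized circles.


Lemma 2.3 Let $\varphi(z) = 1/z$ and let $\gamma$ be a generalized circle in $\mathbb{C}P^1$. Then $\varphi(\gamma)$ is a generalized circle in $\mathbb{C}P^1$.


Proof By the algebraic description of generalized circles, we know that $\gamma$ is the set

\[
\left\{ z \in \mathbb{C}P^1 \;\middle|\; Az\bar{z} + Bz + \bar{B}\bar{z} + C = 0 \right\}
\]

for some $A, C \in \mathbb{R}$ and $B \in \mathbb{C}$ such that $AC - B\bar{B} < 0$. The image of this set is the set

\[
\begin{aligned}
\left\{ \frac{1}{z} \in \mathbb{C}P^1 \;\middle|\; Az\bar{z} + Bz + \bar{B}\bar{z} + C = 0 \right\} &= \left\{ z \in \mathbb{C}P^1 \;\middle|\; A \frac{1}{z} \frac{1}{\bar{z}} + B \frac{1}{z} + \bar{B} \frac{1}{\bar{z}} + C = 0 \right\} \\
&= \left\{ z \in \mathbb{C}P^1 \;\middle|\; Cz\bar{z} + \bar{B}z + B\bar{z} + A = 0 \right\}.
\end{aligned}
\]

Since $CA - \bar{B}B < 0$, we see that $\varphi(\gamma)$ is the set of solutions to a new equation describing a generalized circle. □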
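The coefficient swap in Lemma 2.3 can be checked on a concrete circle. In this sketch a generalized circle is stored as a triple `(A, B, C)`; that encoding and the helper names `residual` and `image_under_reciprocal` are assumptions made for illustration.

```python
import cmath

def residual(coeffs, z):
    """Evaluate A z z̄ + B z + B̄ z̄ + C at a finite point z."""
    A, B, C = coeffs
    return A * abs(z) ** 2 + 2 * (B * z).real + C

def image_under_reciprocal(coeffs):
    """Coefficients of the image of a generalized circle under z -> 1/z."""
    A, B, C = coeffs
    return (C, B.conjugate(), A)

circle = (1, -1 + 0j, 0)                # |z - 1| = 1, which passes through 0
image = image_under_reciprocal(circle)  # A becomes 0: the line Re(z) = 1/2

# Sample points of the circle (avoiding z = 0) and push them through 1/z.
angles = (0.0, 0.5, 1.0, 2.0, 4.0)
points = [1 + cmath.exp(1j * t) for t in angles]
errors = [abs(residual(image, 1 / z)) for z in points]
```

Every entry of `errors` is zero up to rounding, so the images of the sampled points satisfy the new equation. Because the original circle passes through 0, the new leading coefficient is 0 and the image is a line.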


Of course, this result is merely a stepping stone to what we want to show: all 
linear fractional transformations preserve generalized circles. 


Theorem 2.11 Let $\varphi \in \text{Möb}^0(2)$ and let $\gamma$ be a generalized circle in $\mathbb{C}P^1$. Then $\varphi(\gamma)$ is a generalized circle in $\mathbb{C}P^1$.


Proof By the decomposition theorem for linear fractional transformations on $\mathbb{C}P^1$, we know that $\varphi$ is a composition of translations, rotations, dilations, and $z \mapsto 1/z$. The first three preserve generalized circles by Theorem 1.6. By Lemma 2.3, we know that $z \mapsto 1/z$ does as well. Therefore, so does the composition of them all. □


This result is of enormous importance for a number of reasons. First, this basic property of linear fractional transformations will be most useful when we investigate how to give elegant proofs of various classical theorems in Euclidean geometry about lines and circles. Second, it will allow us to give simple definitions of things like angles and orientation without resorting to multivariable calculus or complex analysis.
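> Example What is the image of the curve described by the equation $2z\bar{z} + (3 + i)z + (3 - i)\bar{z} + 3 = 0$ under the map $\varphi : z \mapsto (z + i)/(iz)$?

Since the original curve was the set of solutions

\[
\left\{ z \in \mathbb{C}P^1 \;\middle|\; 2z\bar{z} + (3 + i)z + (3 - i)\bar{z} + 3 = 0 \right\},
\]

the image will be the set

\[
\begin{aligned}
&\left\{ \varphi(z) \in \mathbb{C}P^1 \;\middle|\; 2z\bar{z} + (3 + i)z + (3 - i)\bar{z} + 3 = 0 \right\} \\
&\quad = \left\{ z \in \mathbb{C}P^1 \;\middle|\; 2\varphi^{-1}(z)\overline{\varphi^{-1}(z)} + (3 + i)\varphi^{-1}(z) + (3 - i)\overline{\varphi^{-1}(z)} + 3 = 0 \right\}.
\end{aligned}
\]

It isn't hard to compute that

\[
\varphi^{-1}(z) = \frac{i}{iz - 1},
\]

so the following equations are all equivalent.

\[
\begin{aligned}
2\varphi^{-1}(z)\overline{\varphi^{-1}(z)} + (3 + i)\varphi^{-1}(z) + (3 - i)\overline{\varphi^{-1}(z)} + 3 &= 0 \\
2 \cdot \frac{i}{iz - 1} \cdot \frac{-i}{-i\bar{z} - 1} + 2\Re\left( (3 + i) \frac{i}{iz - 1} \right) + 3 &= 0 \\
3|iz - 1|^2 + 2\Re\left( (3 + i)\, i\, (-i\bar{z} - 1) \right) + 2 &= 0.
\end{aligned}
\]

If we write $z = x + iy$, then $|iz - 1|^2 = x^2 + (y + 1)^2$ and $i(-i\bar{z} - 1) = x - i(y + 1)$, so the above can be expanded to

\[
\begin{aligned}
3|iz - 1|^2 + 2\Re\left( (3 + i)\, i\, (-i\bar{z} - 1) \right) + 2 &= 3(x^2 + y^2 + 2y + 1) + 2(3x + y + 1) + 2 \\
&= 3x^2 + 6x + 3y^2 + 8y + 7 = 0.
\end{aligned}
\]

To make further progress, we complete the square.

\[
\begin{aligned}
3x^2 + 6x + 3y^2 + 8y + 7 &= 3\left( x^2 + 2x + 1 \right) + 3\left( y^2 + \frac{8}{3} y + \frac{16}{9} \right) + 7 - 3 - \frac{16}{3} \\
&= 3(x + 1)^2 + 3\left( y + \frac{4}{3} \right)^2 - \frac{4}{3} = 0.
\end{aligned}
\]

Finally, dividing by 3 on both sides and rearranging,

\[
(x + 1)^2 + \left( y + \frac{4}{3} \right)^2 = \frac{4}{9},
\]

which is the equation of a circle with center $(-1, -4/3)$ and radius $2/3$.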
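A computation like this one is easy to get wrong by a sign, so a numerical cross-check can be reassuring. The sketch below samples the original curve, pushes the samples through $z \mapsto (z + i)/(iz)$, and fits a circle through three of the images. It assumes, via the formulas in the proof of Theorem 2.10, that the original curve is the circle with center $-(3 - i)/2$ and radius 1; the helper `circumcircle` is a hypothetical name introduced here.

```python
import cmath

def phi(z):
    """The map z -> (z + i)/(iz) from the example."""
    return (z + 1j) / (1j * z)

def circumcircle(a, b, c):
    """Centre and radius of the circle through three non-collinear points."""
    u, v = b - a, c - a
    o = (abs(u) ** 2 * v - abs(v) ** 2 * u) / (u.conjugate() * v - u * v.conjugate())
    return a + o, abs(o)

# The curve 2 z z̄ + (3 + i) z + (3 - i) z̄ + 3 = 0 is the circle |z - z0| = 1.
z0 = -(3 - 1j) / 2
samples = [z0 + cmath.exp(1j * t) for t in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)]
images = [phi(z) for z in samples]

centre, radius = circumcircle(*images[:3])
# How far the remaining images stray from the fitted circle.
spread = max(abs(abs(w - centre) - radius) for w in images)
```

If the image really is a circle, `spread` should be zero up to rounding and `radius` should be close to $2/3$; printing `centre` gives the center to compare against the hand computation.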

Fig. 2.10 An “R” and its image under the map $z \mapsto 1/z$. Note that while it is rotated and distorted,
it is not flipped.


2.5 Oriented Circles 


We defined orientation in Chapter 1 in a way that made use of the fact that similarities
preserve circles. At that time, the concept of orientation made perfect intuitive
sense: things that were orientation-reversing behaved like mirrors, while
orientation-preserving transformations didn’t. But, of course, one can have a curved mirror, and
that too should be “orientation-reversing” in some sense, as in Figure 2.10. On the
other hand, we know that the map $z \mapsto 1/z$ is a composition of a reflection through
a circle and a reflection through a line, and so it should be orientation-preserving.

We could introduce ideas from multivariable calculus in order to define a notion 
of orientation that would apply to all real differentiable maps. This would certainly 
include linear fractional transformations in particular. We proceed more simply: we 
shall take Definition 1.4—which applied to similarities—and alter it just enough for 
it to apply to maps that preserve generalized circles. Before we do that though, we 
will first have to define oriented circles. 


Definition 2.10 An oriented circle $C$ in $\mathbb{CP}^1$ is a generalized circle together with a
direction in which it is traversed if considered as a path. We will write $-C$ for the
same generalized circle, but with the direction of travel reversed, and we shall say that this
oriented circle has the opposite orientation. Any generalized circle splits $\mathbb{CP}^1$ into
two connected regions; the region to the left of the path as it is traversed shall be
termed the interior, written as $\text{Int}(C)$; the region to the right of the path shall be termed
the exterior, written as $\text{Ext}(C)$. We therefore have the relations $\text{Int}(C) = \text{Ext}(-C)$
and $\text{Ext}(C) = \text{Int}(-C)$.

Fig. 2.11 Two oriented circles, with their interiors shaded.


The intuitive picture behind this definition is shown in Figure 2.11. While I gener- 
ally strive for mathematical rigor, this is one case where I think a slightly informal— 
but deeply intuitive—definition is entirely justifiable. The reader who finds this to 
be unacceptably sloppy should bear in mind, though, that for circles and lines, we 
can fix particular kinds of paths, such as those of the form $t \mapsto Re^{it}$ for circles,
which allows us to unambiguously define what we mean by direction and what we 
mean by “to the left of”. 

Using the language of oriented circles, we can give a definition of orientation- 
preserving and orientation-reversing maps that will apply to transformations that 
preserve generalized circles. 


Definition 2.11 Let $\varphi : \mathbb{CP}^1 \to \mathbb{CP}^1$ be a transformation such that


1. $\varphi$ maps oriented circles to oriented circles and
2. for any oriented circle $C$, the image of $\text{Int}(C)$ is either $\text{Int}(\varphi(C))$ or $\text{Ext}(\varphi(C))$.


We say that $\varphi$ is orientation-preserving if for any oriented circle $C$, $\varphi(\text{Int}(C)) =
\text{Int}(\varphi(C))$. We say that $\varphi$ is orientation-reversing if for any oriented circle $C$,
$\varphi(\text{Int}(C)) = \text{Ext}(\varphi(C))$.


Remark 2.9 The second restriction in Definition 2.11 may potentially feel a little
artificial. It can be replaced with the following, simpler requirement: $\varphi$ must be
continuous with a continuous inverse.


While this definition is perhaps a little harder to digest than Definition 1.4, I claim 
that it is nothing more than a generalization. Indeed, this new definition reduces to 
the old one in the case where $\varphi$ is a similarity. This will be easiest to prove using the
following lemma. 


Lemma 2.4 Consider functions $\varphi_1, \varphi_2 : \mathbb{CP}^1 \to \mathbb{CP}^1$. All of the following are true.


1. If $\varphi_1, \varphi_2$ are both orientation-preserving, then $\varphi_1 \circ \varphi_2$ is orientation-preserving.

2. If $\varphi_1, \varphi_2$ are both orientation-reversing, then $\varphi_1 \circ \varphi_2$ is orientation-preserving.

3. If one of $\varphi_1, \varphi_2$ is orientation-preserving and the other is orientation-reversing,
then $\varphi_1 \circ \varphi_2$ is orientation-reversing.

Proof Choose any oriented circle $C$. If $\varphi_1, \varphi_2$ are both orientation-preserving, then

$$(\varphi_1 \circ \varphi_2)(\text{Int}(C)) = \varphi_1(\text{Int}(\varphi_2(C))) = \text{Int}((\varphi_1 \circ \varphi_2)(C)).$$

I leave the other cases as an exercise for the reader. (See Exercise 2.2.6.) $\square$


Theorem 2.12 Let $\varphi : \mathbb{CP}^1 \to \mathbb{CP}^1$ be a similarity (with the usual extension that
$\varphi(\infty) = \infty$). Then $\varphi$ is orientation-preserving/-reversing in the sense of Definition
2.11 if and only if it is orientation-preserving/-reversing in the sense of Definition 1.4.


Proof By the decomposition theorem for complex affine maps (Theorem 1.1), the
classification of orientation-preserving and orientation-reversing similarities
(Theorem 1.7), and Lemma 2.4, we know that it suffices to prove this result for maps like
$z \mapsto z + b$, $z \mapsto rz$, $z \mapsto e^{i\theta}z$, and $z \mapsto \bar{z}$. We already know that the first three are
all orientation-preserving in the sense of Definition 1.4, and the fourth is
orientation-reversing. We will be done if we show that the same is true using Definition 2.11.
Choose any oriented circle $C$. There are three cases.


1. $C$ is a circle with center $z_0$ and radius $R$, traversed counter-clockwise: its interior
is the set of all points $z$ such that $|z - z_0| < R$.

2. $C$ is a circle with center $z_0$ and radius $R$, traversed clockwise: its interior is the
set of all points $z$ such that $|z - z_0| > R$.

3. $C$ is a line passing through the point $z_0$ and traversed in the direction $e^{i\tau}$: its
interior is the set of all points $z$ such that $z = z_0 + e^{i\tau}t + ie^{i\tau}s$ for some $t \in \mathbb{R}$
and some $s > 0$.


In the first case, $z \mapsto z + b$, $z \mapsto rz$, and $z \mapsto e^{i\theta}z$ all move $C$ to a circle traversed
counter-clockwise, whose interior is the open disk that it bounds. This disk is exactly the
image of the interior of $C$, so these maps are all orientation-preserving. However, $z \mapsto \bar{z}$
produces a circle traversed clockwise, which therefore has interior $|z - \bar{z}_0| > R$.
The image of $|z - z_0| < R$ under $z \mapsto \bar{z}$ is $|z - \bar{z}_0| < R$, which is the exterior of the
image of $C$; therefore, $z \mapsto \bar{z}$ is orientation-reversing. The other cases are similar,
and so I leave them as an exercise to the reader. (See Exercise 2.2.7.) $\square$


As we expect, all linear fractional transformations on $\mathbb{CP}^1$ are
orientation-preserving. To prove this, we need an important lemma.


Lemma 2.5 Under stereographic projection, the map $\varphi(z) = 1/z$ corresponds to a
rotation of the unit sphere by $\pi$ radians around the real axis. That is,

$$\text{stereo}^{-1} \circ \varphi \circ \text{stereo} : S^2 \to S^2, \qquad (x, y, z) \mapsto (x, -y, -z).$$


Remark 2.10 This correspondence is likely easier to see via an illustration of a disk 
on the sphere being rotated, as in Figure 2.12. 


Fig.2.12 The stereographic projection of an oriented circle onto the sphere, as the sphere is rotated. 
Everything has been sliced by a plane for easier viewing of the interior. 


Proof This is equivalent to proving that $\varphi(\text{stereo}(x, y, z)) = \text{stereo}((x, -y, -z))$.
Since $\text{stereo}(0, 0, 1) = \infty$ and $\text{stereo}(0, 0, -1) = 0$, if $z = 1$, then

$$\varphi(\text{stereo}(x, y, z)) = \varphi(\infty) = 0 = \text{stereo}((x, -y, -z));$$

similarly, if $z = -1$, then

$$\varphi(\text{stereo}(x, y, z)) = \varphi(0) = \infty = \text{stereo}((x, -y, -z)).$$

If $z \neq \pm 1$, then

$$\varphi(\text{stereo}(x, y, z)) = \varphi\left(\frac{x+iy}{1-z}\right) = \frac{1-z}{x+iy} = \frac{(1-z)(x-iy)}{x^2+y^2}
= \frac{(1-z)(x-iy)}{1-z^2} = \frac{x-iy}{1+z}.$$

But this is the same as $\text{stereo}((x, -y, -z))$. $\square$
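
The identity in this proof is easy to spot-check numerically. The sketch below assumes the convention $\text{stereo}(x, y, z) = (x + iy)/(1 - z)$ used in the computation above, with `math.inf` standing in for the point $\infty$; the function names are illustrative only.

```python
import math

def stereo(x, y, z):
    # assumed convention: projection from the north pole (0, 0, 1), which maps to infinity
    if z == 1:
        return math.inf
    return complex(x, y) / (1 - z)

def phi(w):
    # w -> 1/w on CP^1, with 0 and infinity swapped
    if w == math.inf:
        return 0
    if w == 0:
        return math.inf
    return 1 / w

def sphere_point(theta, h):
    # point on the unit sphere at height h and longitude theta
    r = math.sqrt(1 - h * h)
    return (r * math.cos(theta), r * math.sin(theta), h)

# ten sample points, none of them at a pole
points = [sphere_point(0.7 * k, -0.9 + 0.2 * k) for k in range(10)]
# largest discrepancy between phi(stereo(p)) and stereo of the rotated point
max_err = max(abs(phi(stereo(x, y, z)) - stereo(x, -y, -z)) for x, y, z in points)
print(max_err)
```

A value of `max_err` at rounding-error level is consistent with the lemma; the two poles are handled by the explicit special cases, exactly as in the proof.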


Theorem 2.13 Let $\varphi \in \text{Möb}^0(2)$. Then $\varphi$ is orientation-preserving.


Proof In light of the decomposition theorem for linear fractional transformations
on $\mathbb{CP}^1$, Lemma 2.4, and Theorem 2.12, it suffices to prove that $\varphi(z) = 1/z$ is
orientation-preserving. This is most easily seen via Lemma 2.5: any oriented circle
$C$ in $\mathbb{CP}^1$ has an image which is some curve $\gamma$ on $S^2$. If the interior of $C$ is on the
left of the curve, then the interior of $\gamma$ will be on the right. Rotating the sphere will
move $\gamma$ to some new curve, but the interior will still remain on the right-hand side.
But then after reversing the projection, we get a new oriented circle $\varphi(C)$ whose
interior is on the left-hand side. $\square$


▸ Example Does there exist $\varphi \in \text{Möb}^0(2)$ such that for some $t > 1$, $\varphi(0) = 0$,
$\varphi(1) = 1$, $\varphi(\infty) = t$, and $\Im(\varphi(i)) < 0$?

Suppose that there was such a $\varphi$. Consider $\mathbb{R} \cup \{\infty\}$; this is a generalized circle
that passes through $0, 1, \infty$. Its image under $\varphi$ must be a generalized circle passing
through $0, 1, t$, which is to say that it must be $\mathbb{R} \cup \{\infty\}$ again. In fact, if we give
$\mathbb{R} \cup \{\infty\}$ an orientation by saying that we must traverse it from left to right (i.e. from
0 to 1 to $\infty$), then its image under $\varphi$ must also be traversed in that same direction
(i.e. from 0 to 1 to $t$). But $i$ is in the interior of $\mathbb{R} \cup \{\infty\}$ given this orientation, and
$\varphi(i)$ is not, even though $\varphi$ is orientation-preserving by Theorem 2.13. This is a clear
contradiction, ergo there is no such $\varphi \in \text{Möb}^0(2)$.
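
For a concrete illustration, one easily checked map that sends $0, 1, \infty$ to $0, 1, t$ is $\varphi_t(z) = tz/(z + t - 1)$. This explicit formula is an assumption brought in for the sketch, not something stated in the example. The Python snippet below confirms the three interpolation conditions for a few values of $t > 1$ and records the imaginary part of $\varphi_t(i)$.

```python
def phi(t, z):
    # candidate map with phi(0) = 0, phi(1) = 1, phi(infinity) = t
    return t * z / (z + t - 1)

imag_parts = {}
for t in (1.5, 2.0, 10.0):
    assert abs(phi(t, 0)) < 1e-12
    assert abs(phi(t, 1) - 1) < 1e-12
    assert abs(phi(t, 1e12) - t) < 1e-6   # a very large z stands in for infinity
    imag_parts[t] = phi(t, 1j).imag       # imaginary part of the image of i

print(imag_parts)
```

In each case the imaginary part comes out positive, in line with the argument that $i$ must stay in the interior.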


2.6 Angles, Revisited 


We previously showed in Chapter 1 that all similarities preserve angles between
lines. Illustrations like Figures 2.8 and 2.10 certainly suggest that in some sense,
maps like $z \mapsto 1/z$ still preserve angles, even though they don’t preserve lines in
Fig. 2.13 On the left is a family of generalized circles all tangent at one point $z_0$. On the right, two
generalized circles intersect at $z_0$ at an angle $\theta$ defined by their tangent lines.


general. We can make this precise as follows: let $\gamma_1, \gamma_2 : [0, 1] \to \mathbb{C}$ be two curves
in the complex plane that intersect at some point $z_0$. We define the angle between
$\gamma_1, \gamma_2$ at $z_0$ to be the angle $\theta$ between their tangent lines at $z_0$, as in Figure 2.13. We
would like to say that a map $\varphi$ preserves the angle between $\gamma_1$ and $\gamma_2$ if the angle
between the tangent lines of $\varphi \circ \gamma_1$ and $\varphi \circ \gamma_2$ at $\varphi(z_0)$ is also $\theta$. Of course, for this
definition to make sense, you need to make sure that the images of $\gamma_1$ and $\gamma_2$ under
$\varphi$ are curves with well-defined tangent lines. We will eliminate this worry simply by
always choosing our curves $\gamma_1, \gamma_2$ to be generalized circles.

There is another, thornier issue with this definition: it forces the point of
intersection to lie inside of $\mathbb{C}$ rather than $\mathbb{CP}^1$. This is inconvenient and against the
general philosophy that $\infty$ is just like any other point in $\mathbb{CP}^1$. Thankfully, there is
an obvious way to define angles at infinity by exploiting a property that is true of
circles: they are either tangent or they intersect in two distinct points, in which case
the angles of intersection are the same at both points. (See Exercises 2.2.8 and 2.2.9.)


Definition 2.12 Let $C_1, C_2$ be two generalized circles that intersect at $\infty$. If $C_1$
and $C_2$ are tangent, define the angle of intersection to be 0. Otherwise, there is
another intersection point $z_0$; define the angle of intersection at $\infty$ to be the angle
of intersection at $z_0$.


This definition is going to make everything wonderfully convenient for us. In
particular, with this definition, we will be able to show that all elements in $\text{Möb}^0(2)$
are angle-preserving. To get to that point, though, we will need a collection of lemmas.
We begin with the observation that if we choose any generalized circle C and a point 
zo on C then there will be an infinite family of generalized circles tangent to C at 
that point. Note that the definition of angle that we have chosen doesn’t care about 
which circle in this family we select. 


Lemma 2.6 Let $C_1, C_2, C_3$ be generalized circles that all intersect at a point $z_0 \in
\mathbb{CP}^1$. If $C_1$ is tangent to $C_2$, then the angle between $C_1$ and $C_3$ is the angle between
$C_2$ and $C_3$.

Fig. 2.14 A collection of circles tangent at $z_0$. One of them passes through a second point $z_1$.


Proof Let $l_1, l_2, l_3$ be the tangent lines to $C_1, C_2, C_3$ at $z_0$. Clearly, $l_1 = l_2$. But this
means that in both cases, the angle is simply the angle between $l_1$ and $l_3$. $\square$


That we can make this replacement makes life much easier because we can choose 
a member of a family of tangent circles that has convenient properties such as passing 
through a particular point. 


Lemma 2.7 Let $C$ be a generalized circle and $z_0$ a point on $C$. For any $z_1 \in \mathbb{CP}^1$,
there exists a generalized circle $C_1$ that is tangent to $C$ at $z_0$ and which passes
through $z_1$.


Proof This statement is visually obvious, as illustrated in Figure 2.14. I leave the 
proof as an exercise for the reader. (See Exercise 2.2.10.) $\square$


The last component that we need to proceed with a proof is to understand how 
linear fractional transformations interact with our chosen definition of angle. 


Lemma 2.8 Let $C_1, C_2, C_3$ be generalized circles that all intersect at a point $z_0 \in \mathbb{C}$.
Let $\varphi \in \text{Möb}^0(2)$. If $C_1$ is tangent to $C_2$, then $\varphi(C_1)$ is tangent to $\varphi(C_2)$ and the
angle between $\varphi(C_1)$ and $\varphi(C_3)$ is the angle between $\varphi(C_2)$ and $\varphi(C_3)$.


Proof We know that $\varphi(C_1), \varphi(C_2), \varphi(C_3)$ will be generalized circles intersecting at
$\varphi(z_0)$. If $C_1 = C_2$ then $\varphi(C_1) = \varphi(C_2)$; alternatively, if the intersection between
$C_1$ and $C_2$ is just $z_0$, then the intersection between $\varphi(C_1)$ and $\varphi(C_2)$ must be just $\varphi(z_0)$
since $\varphi$ is bijective. Therefore, in either case, $\varphi(C_1)$ is tangent to $\varphi(C_2)$ at $\varphi(z_0)$. By
Lemma 2.6, we know that the angle between $\varphi(C_1)$ and $\varphi(C_3)$ is the angle between
$\varphi(C_2)$ and $\varphi(C_3)$. $\square$


With this in mind, it will now be comparatively easy to prove that elements of
$\text{Möb}^0(2)$ are angle-preserving.


Theorem 2.14 Let $\varphi \in \text{Möb}^0(2)$. For any two generalized circles $C_1, C_2$ that
intersect at a point $z_0 \in \mathbb{CP}^1$, the angle of intersection of $C_1$ and $C_2$ is equal to the angle
of intersection of $\varphi(C_1)$ and $\varphi(C_2)$.

Fig. 2.15 A visual sketch of the proof of Theorem 2.14. (a) shows the initial configuration with
$C_1, C_2$ intersecting at $z_0$; in (b), we exchange both circles with tangent ones $C_3, C_4$ that pass through
the origin $O$; in (c) and (d), we replace those circles with their tangent lines $C_5, C_6$.


Proof It is easy to see that if this is true for $\varphi_1, \varphi_2 \in \text{Möb}^0(2)$ then it must be true
for $\varphi_1 \circ \varphi_2$. Since we already know that translations, rotations, and dilations are all
angle-preserving, it shall suffice to prove that $z \mapsto 1/z$ is angle-preserving. Since
we know that $z \mapsto \bar{z}$ is angle-preserving, it will suffice to prove that $z \mapsto 1/\bar{z}$, the
reflection through the unit circle, is angle-preserving.

Choose any two generalized circles $C_1, C_2$ with a common point of intersection $z_0$.
By Lemmas 2.6 and 2.7, we can choose some generalized circles $C_3, C_4$ such that


1. $C_3, C_4$ are tangent at $z_0$ to $C_1$ and $C_2$, respectively;

2. $C_3$ and $C_4$ pass through 0; and

3. the angle of intersection between $C_1, C_2$ is the same as the angle of intersection
between $C_3$ and $C_4$.


I refer the reader to Figure 2.15 for a diagram of this new configuration. By Lemma
2.8, we know that if the angle of intersection between $C_3$ and $C_4$ is preserved, then
it is preserved between $C_1$ and $C_2$. On the other hand, we know that 0 is another
common intersection point of $C_3$ and $C_4$, and therefore the angle of intersection
at that point must also be the same. Furthermore, if that angle of intersection is
Fig. 2.16 An angle-preserving transformation that does not preserve generalized circles.


preserved, then so is the angle at $z_0$. The final reduction is as follows: we can replace
$C_3$ and $C_4$ with two generalized circles $C_5$ and $C_6$ that are tangent to them at 0 and
which both pass through $\infty$; that is, $C_5$ and $C_6$ are both lines passing through 0.
Of course, the angle between these two lines must be the same as the angle between
$C_3$ and $C_4$, and if this angle is preserved by $z \mapsto 1/\bar{z}$, then the original angle is also
preserved. However, it is easy to see that the reflection through the unit circle simply
fixes lines through the origin and so this final angle is preserved. $\square$


Remark 2.11 This approach does not generalize to discussing angles of intersection
between arbitrary curves. Linear fractional transformations preserve those too, but
proving it requires some knowledge of the derivatives of maps $\mathbb{R}^2 \to \mathbb{R}^2$ or, even
better, complex analysis. For the latter approach, I refer the interested reader to
Needham’s Visual Complex Analysis [11].


While the fact that elements of $\text{Möb}^0(2)$ are angle-preserving is a special
property, the reader should not mistakenly conclude that these are the only types of
transformations that have this property: there is a very large family of functions of
interest in complex analysis that all possess this same quality. An example of such a
function is depicted in Figure 2.16.


2.7 The Cross-Ratio 


When we introduced maps $z \mapsto az + b$, it was in the context of transformations that
preserve distances or ratios of distances. A reasonable question to ask is whether there
is some similar type of quantity that is preserved by linear fractional transformations.
And, indeed, there is! To motivate the definition, let us recall that we defined a
similarity as being a map $\Phi$ such that for any triple of points $z_1, z_2, z_3$,

$$\frac{|z_1 - z_2|}{|z_1 - z_3|} = \frac{|\Phi(z_1) - \Phi(z_2)|}{|\Phi(z_1) - \Phi(z_3)|}.$$

However, we could also write

$$\frac{|z_1 - z_2|^2}{|z_1 - z_3|^2} = \frac{(z_1 - z_2)\overline{(z_1 - z_2)}}{(z_1 - z_3)\overline{(z_1 - z_3)}}
= \left(\frac{z_1 - z_2}{z_1 - z_3}\right)\overline{\left(\frac{z_1 - z_2}{z_1 - z_3}\right)}$$

and with this inspiration we might notice that actually for any triple of points $z_1, z_2, z_3$
it will be true that

$$\frac{z_1 - z_2}{z_1 - z_3} = \frac{\Phi(z_1) - \Phi(z_2)}{\Phi(z_1) - \Phi(z_3)}$$

for any orientation-preserving similarity $\Phi$, and

$$\frac{z_1 - z_2}{z_1 - z_3} = \overline{\left(\frac{\Phi(z_1) - \Phi(z_2)}{\Phi(z_1) - \Phi(z_3)}\right)}$$

for any orientation-reversing similarity $\Phi$. It is easy enough to check that this is not
always true for elements of $\text{Möb}^0(2)$, but there is an easy generalization that does
work.
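
These three claims can be tried out on a single triple of points. The Python sketch below uses arbitrarily chosen coefficients and sample points (all hypothetical); it evaluates the ratio $(z_1 - z_2)/(z_1 - z_3)$ before and after an orientation-preserving similarity, an orientation-reversing similarity, and the map $z \mapsto 1/z$.

```python
def ratio(z1, z2, z3):
    # the complex ratio (z1 - z2) / (z1 - z3)
    return (z1 - z2) / (z1 - z3)

A, B = 2 - 1j, 0.5 + 3j   # arbitrary coefficients with A != 0

def direct(z):
    # orientation-preserving similarity z -> A z + B
    return A * z + B

def flipped(z):
    # orientation-reversing similarity z -> A conj(z) + B
    return A * z.conjugate() + B

def reciprocal(z):
    # z -> 1/z, a linear fractional transformation that is not a similarity
    return 1 / z

pts = (1 + 1j, -2 + 0.5j, 3 - 2j)
r = ratio(*pts)
r_direct = ratio(*(direct(z) for z in pts))
r_flipped = ratio(*(flipped(z) for z in pts))
r_reciprocal = ratio(*(reciprocal(z) for z in pts))
print(r, r_direct, r_flipped, r_reciprocal)
```

Here `r_direct` agrees with `r`, `r_flipped` agrees with its complex conjugate, and `r_reciprocal` agrees with neither, which is what motivates passing to four points.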


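Definition 2.13 For any four distinct points $z_1, z_2, z_3, z_4 \in \mathbb{CP}^1$, their cross-ratio
is defined to be the complex number

$$[z_1, z_2; z_3, z_4] = \begin{cases}
\dfrac{z_2 - z_4}{z_2 - z_3} & \text{if } z_1 = \infty \\[2ex]
\dfrac{z_1 - z_3}{z_1 - z_4} & \text{if } z_2 = \infty \\[2ex]
\dfrac{z_2 - z_4}{z_1 - z_4} & \text{if } z_3 = \infty \\[2ex]
\dfrac{z_1 - z_3}{z_2 - z_3} & \text{if } z_4 = \infty \\[2ex]
\dfrac{(z_1 - z_3)(z_2 - z_4)}{(z_2 - z_3)(z_1 - z_4)} & \text{otherwise.}
\end{cases}$$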


Remark 2.12 If the piecewise definition is unappealing, one could always define this
in terms of a limit as

$$[z_1, z_2; z_3, z_4] = \lim_{(w_1, w_2, w_3, w_4) \to (z_1, z_2, z_3, z_4)} \frac{(w_1 - w_3)(w_2 - w_4)}{(w_2 - w_3)(w_1 - w_4)}.$$

Alternatively, one way to remember what the cross-ratio is is to write down

$$\frac{(z_1 - z_3)(z_2 - z_4)}{(z_2 - z_3)(z_1 - z_4)}$$

but if any of $z_1, z_2, z_3, z_4$ is $\infty$, simply remove the factors in the numerator and
denominator that contain it.
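
The mnemonic in this remark translates almost word for word into code. The following Python sketch is one possible rendering, with `math.inf` used as a stand-in for the point $\infty$ (an implementation choice, not notation from the text).

```python
import math

INF = math.inf  # stand-in for the point at infinity of CP^1

def cross_ratio(z1, z2, z3, z4):
    # (z1 - z3)(z2 - z4) / ((z2 - z3)(z1 - z4)),
    # dropping every factor that contains the point at infinity
    numerator = [(z1, z3), (z2, z4)]
    denominator = [(z2, z3), (z1, z4)]
    value = 1
    for p, q in numerator:
        if INF not in (p, q):
            value *= p - q
    for p, q in denominator:
        if INF not in (p, q):
            value /= p - q
    return value

finite = cross_ratio(2, 3, 5, 7)             # no point at infinity
at_infinity = cross_ratio(2, 3, 5, INF)      # z4 = infinity
nearly_infinite = cross_ratio(2, 3, 5, 1e9)  # z4 very large but finite
print(finite, at_infinity, nearly_infinite)
```

Comparing `at_infinity` with `nearly_infinite` illustrates the limit description: moving $z_4$ far away gives values close to the one obtained by deleting the factors that contain it.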


Theorem 2.15 Elements of $\text{Möb}^0(2)$ preserve the cross-ratio, in the sense that if
$z_1, z_2, z_3, z_4$ are four distinct points in $\mathbb{CP}^1$ and $\varphi \in \text{Möb}^0(2)$, then

$$[z_1, z_2; z_3, z_4] = [\varphi(z_1), \varphi(z_2); \varphi(z_3), \varphi(z_4)].$$

Proof First, we note that all orientation-preserving similarities preserve the
cross-ratio. This is because

$$\frac{(z_1 - z_3)(z_2 - z_4)}{(z_2 - z_3)(z_1 - z_4)} = \frac{z_1 - z_3}{z_2 - z_3} \cdot \frac{z_2 - z_4}{z_1 - z_4}$$

and we already saw that orientation-preserving similarities preserve expressions of
these forms. Since every element of $\text{Möb}^0(2)$ is a composition of such similarities and
$z \mapsto 1/z$, it will suffice to prove that $\varphi(z) = 1/z$ preserves the cross-ratio.
This is an easy computation:

$$[\varphi(z_1), \varphi(z_2); \varphi(z_3), \varphi(z_4)]
= \frac{(1/z_1 - 1/z_3)(1/z_2 - 1/z_4)}{(1/z_2 - 1/z_3)(1/z_1 - 1/z_4)}
= \frac{(z_3 - z_1)(z_4 - z_2)}{(z_3 - z_2)(z_4 - z_1)}
= \frac{(z_1 - z_3)(z_2 - z_4)}{(z_2 - z_3)(z_1 - z_4)}
= [z_1, z_2; z_3, z_4].$$

Technically, this computation is correct on the nose only if none of $z_1, z_2, z_3, z_4$ are
either 0 or $\infty$. However, since we can define both $\varphi$ and the cross-ratio in terms of a
limit, this is irrelevant. $\square$
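
A quick numerical experiment makes the theorem tangible. In the Python sketch below, the coefficients of the linear fractional transformation and the four sample points are arbitrary choices (hypothetical values, picked so that no point is sent to $\infty$).

```python
def cross_ratio(z1, z2, z3, z4):
    # cross-ratio of four distinct finite points
    return (z1 - z3) * (z2 - z4) / ((z2 - z3) * (z1 - z4))

def mobius(a, b, c, d):
    # z -> (a z + b) / (c z + d), assuming ad - bc != 0
    assert a * d - b * c != 0
    return lambda z: (a * z + b) / (c * z + d)

phi = mobius(2 + 1j, -1, 1j, 3)           # pole at z = 3i, which is not sampled
pts = (0.5 + 1j, -2, 3 - 1j, 1 + 4j)
before = cross_ratio(*pts)
after = cross_ratio(*(phi(z) for z in pts))
print(before, after)
```

The two printed values agree up to rounding error, as Theorem 2.15 predicts.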


This theorem has a string of important corollaries.

Corollary 2.1 For any triple of distinct points $z_1, z_2, z_3 \in \mathbb{CP}^1$, there is exactly one
$\varphi \in \text{Möb}^0(2)$ such that $\varphi(z_1) = 0$, $\varphi(z_2) = 1$, $\varphi(z_3) = \infty$. Specifically,

$$\varphi(z) = \frac{z_2 - z_3}{z_2 - z_1} \cdot \frac{z - z_1}{z - z_3}.$$

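Proof Suppose there is such a transformation $\varphi$. Choose any point $z \neq z_1, z_2, z_3$.
By Theorem 2.15, we know that

$$[\varphi(z), 1; 0, \infty] = [\varphi(z), \varphi(z_2); \varphi(z_1), \varphi(z_3)] = [z, z_2; z_1, z_3].$$

However,

$$[\varphi(z), 1; 0, \infty] = \frac{\varphi(z) - 0}{1 - 0} = \varphi(z)$$

and

$$[z, z_2; z_1, z_3] = \frac{(z - z_1)(z_2 - z_3)}{(z_2 - z_1)(z - z_3)} = \frac{z_2 - z_3}{z_2 - z_1} \cdot \frac{z - z_1}{z - z_3},$$

and therefore

$$\varphi(z) = \frac{z_2 - z_3}{z_2 - z_1} \cdot \frac{z - z_1}{z - z_3}.$$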

Fig. 2.17 An illustration of how linear fractional transformations allow us to take a triple of points
and move it to any other triple.


It is easy to check that this is an element of $\text{Möb}^0(2)$ with the desired properties. $\square$
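
The formula of Corollary 2.1 is short enough to test directly. The Python sketch below covers only the case of three finite points (an assumption of the sketch; the corollary itself also allows $\infty$), and the sample points are arbitrary.

```python
def to_zero_one_inf(z1, z2, z3):
    # phi(z) = (z2 - z3)/(z2 - z1) * (z - z1)/(z - z3), for finite distinct z1, z2, z3
    def phi(z):
        return (z2 - z3) / (z2 - z1) * (z - z1) / (z - z3)
    return phi

z1, z2, z3 = 1 + 2j, -1j, 3
phi = to_zero_one_inf(z1, z2, z3)
at_z1 = phi(z1)            # should be 0
at_z2 = phi(z2)            # should be 1
near_z3 = phi(z3 + 1e-9)   # should be very large, since phi(z3) is infinity
print(at_z1, at_z2, abs(near_z3))
```

Evaluating slightly off $z_3$ avoids a division by zero while still showing that the values blow up there.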


Corollary 2.2 For any triple of distinct points $w_1, w_2, w_3 \in \mathbb{CP}^1$, there is exactly
one $\varphi \in \text{Möb}^0(2)$ such that $\varphi(0) = w_1$, $\varphi(1) = w_2$, $\varphi(\infty) = w_3$. Specifically,

$$\varphi(z) = \frac{(w_1 - w_2)w_3 z + (w_2 - w_3)w_1}{(w_1 - w_2)z + w_2 - w_3}.$$


Proof If $\varphi(0) = w_1$, $\varphi(1) = w_2$, and $\varphi(\infty) = w_3$, then $\varphi^{-1}(w_1) = 0$, $\varphi^{-1}(w_2) = 1$,
and $\varphi^{-1}(w_3) = \infty$. We know that there is exactly one transformation with that
property, so

$$\varphi^{-1}(z) = \frac{w_2 - w_3}{w_2 - w_1} \cdot \frac{z - w_1}{z - w_3}.$$

Using the techniques we developed for computing inverses of linear fractional
transformations, it is not too difficult to show that

$$\varphi(z) = \frac{(w_1 - w_2)w_3 z + (w_2 - w_3)w_1}{(w_1 - w_2)z + w_2 - w_3},$$

as was claimed. $\square$


Corollary 2.3 For any pair of triples of distinct points $z_1, z_2, z_3$ and $w_1, w_2, w_3$
in $\mathbb{CP}^1$, there is exactly one $\varphi \in \text{Möb}^0(2)$ with the property that $\varphi(z_i) = w_i$ for
$i = 1, 2, 3$.


Remark 2.13 An example of linear fractional transformations moving triples of 
points is illustrated in Figure 2.17. 


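Proof Choose the unique elements $\varphi_1, \varphi_2 \in \text{Möb}^0(2)$ with the properties that
$\varphi_1(z_1) = 0$, $\varphi_1(z_2) = 1$, $\varphi_1(z_3) = \infty$ and $\varphi_2(0) = w_1$, $\varphi_2(1) = w_2$, $\varphi_2(\infty) = w_3$.
Then $\varphi = \varphi_2 \circ \varphi_1$ has the property that $\varphi(z_i) = w_i$ for $i = 1, 2, 3$. Now, suppose
that $\tilde{\varphi} \in \text{Möb}^0(2)$ satisfies $\tilde{\varphi}(z_i) = w_i$ for $i = 1, 2, 3$. Define $\tilde{\varphi}_1 = \varphi_2^{-1} \circ \tilde{\varphi}$. Then
$\tilde{\varphi}_1(z_1) = 0$, $\tilde{\varphi}_1(z_2) = 1$, and $\tilde{\varphi}_1(z_3) = \infty$, so $\tilde{\varphi}_1 = \varphi_1$. Therefore, $\tilde{\varphi} = \varphi_2 \circ \varphi_1 = \varphi$.
$\square$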
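
The construction in this proof can be carried out mechanically by storing a linear fractional transformation $z \mapsto (az+b)/(cz+d)$ as its coefficient tuple `(a, b, c, d)`. The Python sketch below relies on two standard facts: composition corresponds to multiplying the associated $2 \times 2$ matrices, and the inverse has coefficients `(d, -b, -c, a)`. The function names and sample triples are hypothetical, and only finite points are handled.

```python
def to_zero_one_inf(z1, z2, z3):
    # coefficients (a, b, c, d) of z -> (z2 - z3)(z - z1) / ((z2 - z1)(z - z3))
    return (z2 - z3, -(z2 - z3) * z1, z2 - z1, -(z2 - z1) * z3)

def inverse(m):
    # inverse of z -> (a z + b) / (c z + d), up to an overall scalar
    a, b, c, d = m
    return (d, -b, -c, a)

def compose(m, n):
    # coefficients of "m after n", i.e. the 2x2 matrix product
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def apply(m, z):
    a, b, c, d = m
    return (a * z + b) / (c * z + d)

def triple_to_triple(zs, ws):
    # phi = phi_2 o phi_1, where phi_1 sends zs to (0, 1, inf)
    # and phi_2 is the inverse of the map sending ws to (0, 1, inf)
    return compose(inverse(to_zero_one_inf(*ws)), to_zero_one_inf(*zs))

zs = (1, 2j, -1 + 1j)
ws = (0.5, 3, -2j)
m = triple_to_triple(zs, ws)
images = [apply(m, z) for z in zs]
print(images)
```

Working with coefficient tuples rather than nested functions means the intermediate value $\infty$ never has to be represented: the composite is evaluated in a single step.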

Corollary 2.4 Let $\Phi : \mathbb{CP}^1 \to \mathbb{CP}^1$ be any map that preserves the cross-ratio.
Then $\Phi \in \text{Möb}^0(2)$.


Proof First, we note that it must be that $\Phi$ is injective; otherwise

$$[\Phi(z_1), \Phi(z_2); \Phi(z_3), \Phi(z_4)]$$

won’t even be defined in general. Thus, $w_1 = \Phi(0)$, $w_2 = \Phi(1)$, and $w_3 = \Phi(\infty)$
are distinct points. Therefore, there is some $\varphi \in \text{Möb}^0(2)$ such that $\varphi(w_1) = 0$,
$\varphi(w_2) = 1$, and $\varphi(w_3) = \infty$. This implies that $\tilde{\Phi} = \varphi \circ \Phi$ is a transformation that
preserves the cross-ratio and has the property that $\tilde{\Phi}(z) = z$ for $z = 0, 1, \infty$. Choose
any $z \in \mathbb{CP}^1$ other than $0, 1, \infty$, and note that it must be true that

$$z = [z, 1; 0, \infty] = [\tilde{\Phi}(z), 1; 0, \infty] = \tilde{\Phi}(z).$$

This implies that $\Phi = \varphi^{-1} \in \text{Möb}^0(2)$, as desired. $\square$


In short, linear fractional transformations on $\mathbb{CP}^1$ are exactly the transformations
on $\mathbb{CP}^1$ that preserve the cross-ratio; furthermore, they give exactly enough freedom
to move any three distinct points to any other set of three distinct points. The first 
statement gives us a broad philosophical idea of why linear fractional transformations 
should be important or natural; we will see in the next chapter that the second 
statement is extremely useful for writing proofs about Euclidean geometry. 


▸ Example Show there exist distinct points $z_1, z_2, z_3, z_4 \in \mathbb{CP}^1$ such that
$[z_1, z_2; z_3, z_4] = \lambda$ if and only if $\lambda \neq 0, 1, \infty$.

Since linear fractional transformations preserve the cross-ratio and can move any
triple of points to any other triple, we may assume without loss of generality that
$z_2 = 1$, $z_3 = 0$, $z_4 = \infty$, hence the condition is satisfied if and only if

$$\lambda = [z_1, z_2; z_3, z_4] = \frac{z_1 - 0}{1 - 0} = z_1$$

for some $z_1 \neq 0, 1, \infty$.


2.8 The Group of Möbius Transformations


Before we end this chapter, I want to finally explain why we have been using the
notation $\text{Möb}^0(2)$ to stand for the linear fractional transformations on $\mathbb{CP}^1$. The
notation is reminiscent of our notation for similarities from Chapter 1, and so the
reader might correctly guess that $\text{Möb}^0(2)$ is the collection of orientation-preserving
transformations of some larger group. This larger group is the collection of Möbius
transformations.

Definition 2.14 The group of Möbius transformations in $\mathbb{CP}^1$, denoted by $\text{Möb}(2)$,
is the collection of all transformations $\Phi : \mathbb{CP}^1 \to \mathbb{CP}^1$ such that $\Phi$ can be written
as a composition of circle and line inversions.


This definition departs slightly from some common conventions in the literature.
Most often, the term “Möbius transformation” is used to denote what I have termed
“linear fractional transformation on $\mathbb{CP}^1$” (although that term is also used). However,
it is possible to talk about, say, inversions through a sphere, or even an $n$-sphere, and
to consider the group of transformations that can be obtained as compositions of such
inversions. This is relevant for higher-dimensional hyperbolic geometry, for example.
In that context, that group is usually called the group of Möbius transformations. Of
course, that larger group contains elements that are not orientation-preserving, and
so this conflicts with the somewhat more traditional usage. For convenience, I have
opted to call the group obtained by allowing compositions of $n$-sphere inversions
$\text{Möb}(n)$, or the group of Möbius transformations in $n$-dimensional space. Of course,
we should check that this really is a group.


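Theorem 2.16 With function composition as the operation, $\text{Möb}(2)$ is a group.


Proof We know that function composition is associative. Furthermore, we know that
the identity function $z \mapsto z$ has the properties of a group identity. We only need to check
that compositions of elements in $\text{Möb}(2)$ are still elements in $\text{Möb}(2)$ and that they
have inverses. Both assertions are easy to verify: for any $\Phi_1, \Phi_2 \in \text{Möb}(2)$, write

$$\Phi_1 = \varphi_1 \circ \varphi_2 \circ \dots \circ \varphi_m$$

$$\Phi_2 = \psi_1 \circ \psi_2 \circ \dots \circ \psi_n,$$

where all of the $\varphi_i$’s and $\psi_i$’s are inversions. Then

$$\Phi_1 \circ \Phi_2 = \varphi_1 \circ \dots \circ \varphi_m \circ \psi_1 \circ \dots \circ \psi_n \in \text{Möb}(2)$$

and, since every inversion is its own inverse,

$$\Phi_1^{-1} = (\varphi_1 \circ \varphi_2 \circ \dots \circ \varphi_m)^{-1} = \varphi_m^{-1} \circ \dots \circ \varphi_2^{-1} \circ \varphi_1^{-1} \in \text{Möb}(2).$$

This concludes the proof. $\square$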


We can now justify using the notation $\text{Möb}^0(2)$ to denote the collection of linear
fractional transformations on $\mathbb{CP}^1$.


Theorem 2.17 The set $\text{Möb}(2)$ can be partitioned into a subset of
orientation-preserving transformations and a subset of orientation-reversing transformations.
The set $\text{Möb}^0(2)$ is precisely the set of orientation-preserving Möbius
transformations. Any orientation-reversing element can be uniquely written as $\varphi \circ \text{conj}$ for some
$\varphi \in \text{Möb}^0(2)$, where

$$\text{conj} : \mathbb{CP}^1 \to \mathbb{CP}^1, \qquad z \mapsto \bar{z}.$$


Proof Since any circle or line reflection is orientation-reversing, any element $\Phi \in
\text{Möb}(2)$ is orientation-reversing if it is a composition of an odd number of reflections,
and orientation-preserving otherwise. Next, we’ll show that $\text{Möb}^0(2)$ sits inside of
$\text{Möb}(2)$. It shall suffice to prove this for translations, rotations, dilations, and the map
$z \mapsto 1/z$. By Theorem 1.9, we know that every isometry can be written as a composition
of line reflections; in particular, this applies to $z \mapsto z + b$ and $z \mapsto e^{i\theta}z$. We already
saw that $z \mapsto 1/z$ is a composition of a circle reflection and a line reflection, so this
leaves the dilations. For any $\lambda > 0$, consider the inversions through the circles
$|z| = \sqrt{\lambda}$ and $|z| = 1$; call these $\varphi_1$ and $\varphi_2$ respectively. Note that

$$(\varphi_1 \circ \varphi_2)(0) = 0, \qquad (\varphi_1 \circ \varphi_2)(\infty) = \infty$$

and for any $z = re^{i\theta}$ with $r > 0$,

$$(\varphi_1 \circ \varphi_2)(re^{i\theta}) = \varphi_1\left(\frac{1}{r}e^{i\theta}\right) = \lambda re^{i\theta},$$


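whence we have $(\varphi_1 \circ \varphi_2)(z) = \lambda z$. It remains to show that all orientation-preserving
maps in $\text{Möb}(2)$ are linear fractional transformations. By Theorem 2.9, we know
that every circle reflection is a composition of a linear fractional transformation and
$z \mapsto \bar{z}$; we know from Chapter 1 that all line reflections are compositions of a linear
fractional transformation and $z \mapsto \bar{z}$ as well. Notice that if

$$\varphi(z) = \frac{az + b}{cz + d},$$

then

$$\overline{\varphi(z)} = \frac{\bar{a}\bar{z} + \bar{b}}{\bar{c}\bar{z} + \bar{d}}.$$

This means that if we define (by slight abuse of notation)

$$\text{conj} : \text{Möb}^0(2) \to \text{Möb}^0(2), \qquad
\left(z \mapsto \frac{az + b}{cz + d}\right) \mapsto \left(z \mapsto \frac{\bar{a}z + \bar{b}}{\bar{c}z + \bar{d}}\right),$$

then for any $\varphi \in \text{Möb}^0(2)$,

$$\text{conj} \circ \varphi = \text{conj}(\varphi) \circ \text{conj}.$$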
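
Thus, if we write

$$\Phi = \varphi_1 \circ \text{conj} \circ \varphi_2 \circ \text{conj} \circ \dots \circ \varphi_n \circ \text{conj}$$

for some $\varphi_1, \varphi_2, \dots, \varphi_n \in \text{Möb}^0(2)$, then we can rewrite it as

$$\Phi = \varphi_1 \circ \text{conj}(\varphi_2) \circ \text{conj} \circ \text{conj} \circ \varphi_3 \circ \dots \circ \varphi_n \circ \text{conj}
= \begin{cases}
\varphi_1 \circ \text{conj}(\varphi_2) \circ \varphi_3 \circ \dots \circ \varphi_n \circ \text{conj} & \text{if } n \text{ is odd} \\
\varphi_1 \circ \text{conj}(\varphi_2) \circ \varphi_3 \circ \dots \circ \text{conj}(\varphi_n) & \text{if } n \text{ is even.}
\end{cases}$$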


Therefore, every orientation-preserving map in $\text{Möb}(2)$ is a linear fractional
transformation, and any orientation-reversing map is a composition of a linear fractional
transformation and conj. $\square$
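
Two computational steps of this proof lend themselves to a direct check: the dilation $z \mapsto \lambda z$ as a composition of inversions through $|z| = \sqrt{\lambda}$ and $|z| = 1$, and the commutation rule $\text{conj} \circ \varphi = \text{conj}(\varphi) \circ \text{conj}$. The Python sketch below uses the formula $z \mapsto \rho^2/\bar{z}$ for inversion through the circle $|z| = \rho$, together with arbitrarily chosen sample values.

```python
import math

def inversion(radius):
    # reflection through the circle |z| = radius: z -> radius^2 / conj(z)
    return lambda z: radius ** 2 / z.conjugate()

lam = 2.5
phi1, phi2 = inversion(math.sqrt(lam)), inversion(1.0)
z = 0.3 - 1.2j
dilated = phi1(phi2(z))   # composition of the two inversions applied to z

a, b, c, d = 1 + 2j, -1j, 3 + 0j, 2 - 1j   # arbitrary coefficients, ad - bc != 0

def phi(w):
    return (a * w + b) / (c * w + d)

def conj_phi(w):
    # the transformation conj(phi): every coefficient is replaced by its conjugate
    return (a.conjugate() * w + b.conjugate()) / (c.conjugate() * w + d.conjugate())

lhs = phi(z).conjugate()        # (conj o phi)(z)
rhs = conj_phi(z.conjugate())   # (conj(phi) o conj)(z)
print(dilated, lam * z, lhs, rhs)
```

Up to rounding, `dilated` equals `lam * z` and `lhs` equals `rhs`.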


This result allows us to characterize $\text{Möb}(2)$ in a different way.


Corollary 2.5 The set of Möbius transformations on $\mathbb{CP}^1$ is exactly the set of maps
$\Phi : \mathbb{CP}^1 \to \mathbb{CP}^1$ such that either $\Phi$ or $\text{conj} \circ \Phi$ preserves the
cross-ratio.


Proof This is an immediate consequence of Corollary 2.4 and Theorem 2.17. $\square$


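To sum up, we know that if $\Phi \in \text{Möb}(2)$, then for any distinct quadruple of points
$z_1, z_2, z_3, z_4 \in \mathbb{CP}^1$,

$$[\Phi(z_1), \Phi(z_2); \Phi(z_3), \Phi(z_4)] = \begin{cases}
[z_1, z_2; z_3, z_4] & \text{if } \Phi \in \text{Möb}^0(2) \\
\overline{[z_1, z_2; z_3, z_4]} & \text{otherwise.}
\end{cases}$$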


This has the obvious corollary that any element of $\text{Möb}(2)$ will preserve the quantity

$$|[z_1, z_2; z_3, z_4]| = \frac{|z_1 - z_3||z_2 - z_4|}{|z_2 - z_3||z_1 - z_4|}
= \frac{d_{\text{Euclid}}(z_1, z_3)\, d_{\text{Euclid}}(z_2, z_4)}{d_{\text{Euclid}}(z_2, z_3)\, d_{\text{Euclid}}(z_1, z_4)},$$

which is also sometimes referred to as the cross-ratio. One may well ask whether
Möbius transformations are the only kind of transformations that preserve this
quantity, and indeed they are.


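Theorem 2.18 (Cross-Ratio Characterization of the Möbius Group) The set
of Möbius transformations on $\mathbb{CP}^1$ is exactly the set of transformations
$\Phi : \mathbb{CP}^1 \to \mathbb{CP}^1$ such that for all distinct quadruples of points $z_1, z_2, z_3, z_4$,

$$\frac{d_{\text{Euclid}}(z_1, z_3)\, d_{\text{Euclid}}(z_2, z_4)}{d_{\text{Euclid}}(z_2, z_3)\, d_{\text{Euclid}}(z_1, z_4)}
= \frac{d_{\text{Euclid}}(\Phi(z_1), \Phi(z_3))\, d_{\text{Euclid}}(\Phi(z_2), \Phi(z_4))}{d_{\text{Euclid}}(\Phi(z_2), \Phi(z_3))\, d_{\text{Euclid}}(\Phi(z_1), \Phi(z_4))}.$$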


Proof We just showed that all Möbius transformations have this property. On the
other hand, let $\Phi$ be a transformation that preserves this cross-ratio. By composing
with Möbius transformations if necessary, we can assume without loss of generality
that $\Phi(0) = 0$, $\Phi(1) = 1$, and $\Phi(\infty) = \infty$. But this means that for any $z \in
\mathbb{CP}^1 \setminus \{0, 1, \infty\}$,

$$|z| = |[z, 1; 0, \infty]| = |[\Phi(z), 1; 0, \infty]| = |\Phi(z)|$$

$$|z - 1| = |[z, 0; 1, \infty]| = |[\Phi(z), 0; 1, \infty]| = |\Phi(z) - 1|.$$

From this, it is easy to deduce that for any such $z$, either $\Phi(z) = z$ or $\Phi(z) = \bar{z}$.
Therefore, by composing with conj if necessary, we can assume that $\Phi(i) = i$. But
this means there is an additional restriction

$$|z - i| = |[z, 0; i, \infty]| = |[\Phi(z), 0; i, \infty]| = |\Phi(z) - i|.$$

This forces $\Phi(z) = z$ for all $z \in \mathbb{CP}^1$, and we see that $\Phi \in \text{Möb}(2)$. $\square$


While I titled this chapter “Inversive Geometry,” I never explained what this 
actually is. With these final theorems, however, one can give a fairly straightforward 
definition: inversive geometry is the study of what is preserved by circle reflections 
or, equivalently, the types of transformations that preserve the cross-ratio. 

Problems


2.1 COMPUTATIONAL EXERCISES 


1. For each of the following, compute the image under $\varphi : \mathbb{CP}^1 \to \mathbb{CP}^1$.


a) Line y = x, p(z) = 7S. 
b) Line x = 4, g(z) = WE 


iz—l 


- — (142i)z+1~2i 
c) Circle |z| = 1, g(z) = =e : | 
d) Circle |z — 3|? = 2, p(z) = G12e = 
2. a) Find the angle between the line y = x +4 and the circle |z—243/2i|* = 1/2. 
b) Find the images of the above-mentioned line and circle under the map 
iz+2+42i 
Zhe - = 
(+i)z+4+i 
What is the angle between the images? Does it match your result from the 
previous part? 

3. For each of the following pairs of triples $(z_1, z_2, z_3)$, $(w_1, w_2, w_3)$, find an
element $\varphi \in \text{Möb}^0(2)$ such that $\varphi(z_i) = w_i$ for $i = 1, 2, 3$.

a) $(0, 1 + i, 1 - i)$, $(0, 1, \infty)$.

b) $(0, 1, \infty)$, $(i, -i, 1)$.

c) $(0, 1 + i, 1 - i)$, $(i, -i, 1)$.
2.2 PROOFS 

1. Prove that if $a, b, c, d \in \mathbb{C}$ and $ad - bc = 0$, then $(az + b)/(cz + d)$ is either
undefined, or is some constant that does not depend on $z$.

2. Prove that if $a, b, c, d \in \mathbb{C}$, $ad - bc \neq 0$, $\varphi(z) = (az + b)/(cz + d)$, and
$\varphi^{-1}(z) = (dz - b)/(-cz + a)$, then $\varphi(\varphi^{-1}(z)) = \varphi^{-1}(\varphi(z)) = z$ for all
$z \in \mathbb{CP}^1$.

3. Finish the proof of Theorem 2.1. 

4. Prove Theorem 2.4. (Hint: Study carefully the proofs of Lemma 2.1 and Theorem 
Zio: ) 

5. a) Prove that $\mathbb{R}$ is the set of complex points $z$ such that $|z - i| = |z + i|$.

b) Prove that any line $l$ in $\mathbb{C}$ can be obtained as the set of solutions to $|z - z_0| = |z - z_1|$ for some complex numbers $z_0 \neq z_1$. (Hint: You may want to use an isometry to reduce to the case $l = \mathbb{R}$.)


c) Prove that if


IBP+C. pile 
20 = sa! = sas 
° “25(B) ; 25(B) 
then the line $|z - z_0| = |z - z_1|$ is the set of solutions to the equation

\[ Bz + \bar{B}\bar{z} + C = 0. \]

6. Finish the proof of Lemma 2.4.

7. Finish the proof of Theorem 2.12.

8. Our goal is to prove that any two generalized circles intersect at either no, one, two, or all points.


a) Prove that for any two lines $L_1, L_2$, if they share two points of intersection, then $L_1 = L_2$.

b) Prove that for any generalized circles $C_1, C_2$, if they share three points of intersection, then $C_1 = C_2$. (Hint: Let $z$ be one of the points of intersection. Consider taking the image under a circle inversion centered at $z$. What will the images of $C_1$ and $C_2$ be?)

c) Prove that for any two generalized circles $C_1, C_2$, if $C_1$ and $C_2$ are not tangent, then either they don't intersect, or they intersect in two points.


9. a) Let $C_1$ be the unit circle, and let $C_2$ be a circle with center $z > 0$ which intersects $C_1$ in exactly two points. Prove that the angles of intersection at these two points are the same. (Hint: What is the effect on $C_1$ and $C_2$ if we take the image under the map $z \mapsto \bar{z}$? What does it do to the two points of intersection?)

b) Let $C_1, C_2$ be two circles that intersect in exactly two points. Prove that the angles of intersection at these two points are the same. (Hint: You may want to use a similarity to reduce to the case where $C_1$ is the unit circle and the center of $C_2$ lies on the positive $x$-axis.)

c) Let $C_1$ be the unit circle, and let $C_2$ be a line $x = x_0$ for some $x_0 > 0$ which intersects $C_1$ in exactly two points. Prove that the angles of intersection at these two points are the same. (Hint: What is the effect on $C_1$ and $C_2$ if we take the image under the map $z \mapsto \bar{z}$? What does it do to the two points of intersection?)

d) Let $C_1$ be a circle and let $C_2$ be a line that intersect in exactly two points. Prove that the angles of intersection at these two points are the same. (Hint: You may want to use a similarity to reduce to the case where $C_1$ is the unit circle and $C_2$ is a vertical line.)

e) Let $C_1, C_2$ be two generalized circles that intersect at exactly two points $z_1, z_2 \in \mathbb{C}$. Prove that the angles of intersection at these two points are the same.
10. a) Let $L$ be a line and let $z_0 = \infty$. For any $z_1 \in \mathbb{C}$, prove that there exists a line $L'$ that is tangent to $L$ at $z_0$ and which passes through $z_1$.

b) Prove that for any two distinct points $z_0, z_1 \in \mathbb{C}$, there exists a circle centered at $z_0$ that goes through $z_1$.

c) Let $C$ be a generalized circle and let $z_0$ be a point on $C$. Let $z_1 \in \mathbb{C}$ be another distinct point. Prove that there exists a generalized circle $C'$ that is tangent to $C$ at $z_0$ and which passes through $z_1$. (Hint: Consider taking the inversion through the circle centered at $z_0$ and passing through $z_1$. What are the images of $C, C', z_0, z_1$?)

d) Finish the proof of Lemma 2.7. 


11. Let $z_1, z_2, z_3, z_4$ be four distinct points in $\mathbb{C}P^1$. Prove each of the following assertions.

a) $[z_1, z_2; z_4, z_3] = 1/[z_1, z_2; z_3, z_4]$.
b) $[z_3, z_4; z_1, z_2] = [z_1, z_2; z_3, z_4]$.
c) $[z_1, z_3; z_2, z_4] = 1 - [z_1, z_2; z_3, z_4]$.


12. Let $z_1, z_2, z_3, z_4$ be four distinct points in $\mathbb{C}P^1$. Let $[z_1, z_2; z_3, z_4] = \lambda$. Prove each of the following assertions. (Hint: You may want to use the result of Exercise 2.2.11.)

a) $[z_1, z_2; z_3, z_4] = [z_2, z_1; z_4, z_3] = [z_3, z_4; z_1, z_2] = [z_4, z_3; z_2, z_1] = \lambda$.
b) $[z_1, z_2; z_4, z_3] = [z_2, z_1; z_3, z_4] = [z_3, z_4; z_2, z_1] = [z_4, z_3; z_1, z_2] = \frac{1}{\lambda}$.
c) $[z_1, z_3; z_2, z_4] = [z_2, z_4; z_1, z_3] = [z_3, z_1; z_4, z_2] = [z_4, z_2; z_3, z_1] = 1 - \lambda$.
d) $[z_1, z_3; z_4, z_2] = [z_2, z_4; z_3, z_1] = [z_3, z_1; z_2, z_4] = [z_4, z_2; z_1, z_3] = \frac{1}{1 - \lambda}$.
e) $[z_1, z_4; z_2, z_3] = [z_2, z_3; z_1, z_4] = [z_3, z_2; z_4, z_1] = [z_4, z_1; z_3, z_2] = \frac{\lambda - 1}{\lambda}$.
f) $[z_1, z_4; z_3, z_2] = [z_2, z_3; z_4, z_1] = [z_3, z_2; z_1, z_4] = [z_4, z_1; z_2, z_3] = \frac{\lambda}{\lambda - 1}$.


13. Prove that $[z_1, z_2; z_3, z_4] = [z_5, z_2; z_3, z_4]$ if and only if $z_1 = z_5$. (Hint: You may want to use the fact that linear fractional transformations allow you to move any three points to any other three points and do not change the cross-ratio.)

14. Prove that for any pair of distinct quadruples of points $z_1, \ldots, z_4$ and $w_1, \ldots, w_4$ with the property that $[z_1, z_2; z_3, z_4] = [w_1, w_2; w_3, w_4]$, there exists a unique element $\varphi \in$ Möb°(2) such that $\varphi(z_i) = w_i$ for $i = 1, 2, 3, 4$. (Hint: You may want to use the result of Exercise 2.2.13.)
2.3 PROOFS (Calculus)

1. Assuming that $a, b, c, d$ are complex numbers, $c \neq 0$, and $ad - bc \neq 0$, prove that

\[ \lim_{z \to -d/c} \frac{az + b}{cz + d} = \infty. \]

If you have not seen limits of complex numbers before, you can interpret this by doing the following change of variables: write $z = -d/c + re^{i\theta}$ and show that

\[ \lim_{r \to 0} \left| \frac{az + b}{cz + d} \right| = \infty \]

regardless of the choice of $\theta$.

2. Assuming that $a, b, c, d$ are complex numbers, $c \neq 0$, and $ad - bc \neq 0$, prove that

\[ \lim_{z \to \infty} \frac{az + b}{cz + d} = \frac{a}{c}. \]

If you have not seen limits of complex numbers before, you can interpret this as follows: prove that

\[ \lim_{r \to \infty} \frac{are^{i\theta} + b}{cre^{i\theta} + d} = \frac{a}{c} \]

regardless of the choice of $\theta$.


2.4 PROOFS (Group Theory)

1. Let $G, H$ be groups with identities $\iota_G$ and $\iota_H$, and operations $\star$ and $\circ$. Let $\varphi : G \to H$ be a group homomorphism.

a) Prove that $\varphi(\iota_G) = \iota_H$.
b) Prove that $\varphi(a^{-1}) = \varphi(a)^{-1}$.

2. Let $G, H$ be groups and let $f : G \to H$ be a group homomorphism. Prove that $f$ is a group isomorphism if and only if there exists a group homomorphism $g : H \to G$ such that $(f \circ g)(x) = x$ and $(g \circ f)(y) = y$ for all $x \in H$ and $y \in G$.

3. Let $G$ be a subgroup of $H$. Prove that the obvious inclusion map

\[ \varphi : G \to H, \quad a \mapsto a \]

is a group homomorphism.
4. Consider $\mathbb{Z}$ and $G = \{\text{even}, \text{odd}\}$ as groups under addition. Prove that the map

\[ \varphi : \mathbb{Z} \to G, \quad n \mapsto \begin{cases} \text{even} & \text{if $n$ is even} \\ \text{odd} & \text{otherwise} \end{cases} \]

is a group homomorphism.

5. Define $PGL(2, \mathbb{C})$ to be the set of equivalence classes of elements in $GL(2, \mathbb{C})$, where two matrices $M_1, M_2$ are considered to be equivalent if there exists $\lambda \in \mathbb{C}$ such that $M_1 = \lambda M_2$.

a) Let $M_1, M_1', M_2, M_2' \in GL(2, \mathbb{C})$ such that $M_1, M_1'$ are equivalent and $M_2, M_2'$ are equivalent. Prove that $M_1 M_2$, $M_1' M_2'$ are equivalent.

b) Using the above, prove that $PGL(2, \mathbb{C})$ is a group. (Hint: The main problem is showing that it has a well-defined multiplication on it. The previous part suggested how to do this.)

c) Prove that the quotient map

\[ \varphi : GL(2, \mathbb{C}) \to PGL(2, \mathbb{C}), \quad M \mapsto [M], \]

where $[M]$ denotes the equivalence class of $M$, is a group homomorphism.


6. Prove that Möb°(2) is isomorphic to $PGL(2, \mathbb{C})$. (Hint: Alter the group homomorphism in Theorem 2.4 slightly using your result from Exercise 2.2.5.)

7. Let $G, H$ be isomorphic groups. Prove that $G$ is abelian if and only if $H$ is abelian.

8. Given a set $X = \{x_1, x_2, \ldots, x_n\}$ of $n$ elements, a permutation is a bijective function $\sigma : X \to X$. (Intuitively, $\sigma$ simply permutes the elements of $X$.) The symmetric group on $n$ elements is the set $S_n$ of all permutations $\sigma : X \to X$. Prove that $S_n$ is a group with function composition as the operation.

9. Choose four arbitrary complex numbers $z_1, z_2, z_3, z_4$; which ones we select is virtually irrelevant, but for concreteness let's suppose that they are $3, 1, 0, \infty$. Consider the set $X = \{z_1, z_2, z_3, z_4\}$ and the symmetric group $S_4$ that shuffles around its elements. Let $K$ be the subset of $S_4$ consisting of permutations $\sigma$ such that $[\sigma(z_1), \sigma(z_2); \sigma(z_3), \sigma(z_4)] = [z_1, z_2; z_3, z_4]$. Write down the elements of $K$ and show that it is a subgroup of $S_4$. (Hint: You may want to do Exercise 2.2.12 first.)

10. In Exercise 2.4.9, to what extent do the four complex numbers that we choose matter? Could we select them in such a way that the resulting subgroup of permutations is smaller than $K$?



Applications of Inversive Geometry 


In which we look at how ancient 
questions can be answered using 
modern tools. 


In Chapters 1 and 2, we studied the properties of linear fractional transformations;
I motivated this as a means of understanding certain kinds of geometries. Now is 
a good time to make good on this promise: we are going to see how convenient 
inversive geometry is when attacking various problems that would have given the 
ancient Greeks and later geometers trouble. 


3.1 Steiner's Porism


We begin by considering a comparatively modern problem posed by Jakob Steiner in the nineteenth century, but which could just as easily have been asked by any of the ancient Greeks. The basic setup is as follows: consider two oriented circles $C_1, C_2$ in the plane which are not tangent and whose interiors do not intersect. Classically, one of these circles is contained inside the other one (in which case we take the orientation of the inner one to be counter-clockwise, and the orientation of the outer one to be clockwise). However, we shall see that we can take one of the oriented circles to be a line without changing anything substantive. In any case, choose a point $p$ on $C_1$.

There is a unique oriented circle $S_0$ that is tangent to $C_1$ at $p$, tangent to $C_2$, and whose interior does not intersect the interiors of $C_1$ or $C_2$. This might not be obvious at this point, but we will prove it.


Choose an oriented circle $S_1$ which is tangent to $C_1$, $C_2$, and $S_0$; there are two possible choices.




Then there is a unique oriented circle $S_2$, other than $S_0$, which is tangent to $S_1, C_1, C_2$. In fact, we can keep going inductively, adding an oriented circle $S_n$ tangent to $S_{n-1}, C_1, C_2$ at each step.


If we require that none of the interiors of these circles intersect, then this process will eventually halt, producing what is called a Steiner chain. If the final circle is tangent to $S_0$, then the chain is said to be closed; otherwise, it is open. Above, we have drawn an example of an open Steiner chain. Figure 3.1 shows some closed Steiner chains differing only in the choice of the initial point.

There are already various things that we might want to prove rigorously here, to show that Steiner chains are well defined. We might want to prove that there is a unique oriented circle through $p$ tangent to both $C_1$ and $C_2$ which doesn't intersect their interiors. We might want to prove that once we choose a direction, there is a unique tangent circle that we can put in at each step of the algorithm. We might want to prove that this procedure always halts after finitely many steps. All of these are worthy considerations, but these are not Steiner's porism. Suppose that we fix our two circles $C_1, C_2$, but we change the point $p$ on $C_1$. This will give us a new Steiner chain. What properties does it share with the old chain? Remarkably, if you try out some examples, you will quickly discover that it appears that:

1. the number of circles in the chain does not depend on the choice of point $p$ and
2. either all chains are open or all chains are closed, regardless of $p$.

Fig. 3.1 A collection of closed Steiner chains differing only in the choice of the initial point.


This is Steiner's porism, and it is the main result that we will try to prove in this section. All of our proofs will be agnostic about the exact configurations of our circles and points; we shall always require just that there are two generalized circles $C_1, C_2$ which are not tangent and whose interiors do not intersect. However, we will see that one can always assume without loss of generality that $C_1$ is a circle inside a circle $C_2$, and, in fact, one can request something even stronger. Before we get into that, though, let's take some time to properly convince ourselves that Steiner chains are well defined, via a sequence of lemmas.


Lemma 3.1 Let $C_1, C_2$ be two oriented circles that are not tangent and whose interiors do not intersect. For any point $p$ on $C_1$, there exists a unique oriented circle $C_3$ tangent to $C_1$ at $p$, tangent to $C_2$, and whose interior does not intersect the interiors of $C_1$ or $C_2$.


Proof Recall that linear fractional transformations preserve both angles and generalized circles but allow us to move any three points in $\mathbb{C}P^1$ to any other three points; therefore, if we choose two points $p', p''$ on $C_1$ aside from $p$, there exists a unique $\varphi \in$ Möb°(2) such that $\varphi(p) = \infty$, $\varphi(p') = 0$, $\varphi(p'') = 1$, and the image under $\varphi$ of $C_1$ is the real line. By rotating if necessary, we can assume that the interior of the image of $C_1$ lies below the real line and therefore the image of $C_2$ is a circle lying above the real line with a counter-clockwise orientation. Note that if the statement of the lemma is true of this new configuration, then it has to be true of the old configuration. Therefore, we can assume without loss of generality that we started with this configuration. If $C_3$ is tangent to $C_1$ at $p = \infty$, then $C_3$ must be a line; more precisely, a line parallel to the real line. It is easy to see that there are two lines tangent to $C_2$ which are parallel to the real line. However, only one of them can be given an orientation such that its interior does not intersect the interiors of $C_1$ or $C_2$. This is illustrated in Figure 3.2, where the original two generalized circles are drawn with their interiors in purple, and the only oriented circle satisfying the desired conditions is drawn in blue. The other tangent line is dashed. □

Fig. 3.2 On the left, an illustration of the circle configuration constructed in Lemma 3.1; $C_1, C_2$ are shown in purple, and $C_3$ is shown in blue. On the right, an illustration of the configuration constructed in Lemma 3.2; $C_1, C_2$ are shown in purple, $C_3$ is shown in blue, and $C_4, C_5$ are shown in green.


This is a good place for an important aside: note that the crucial idea behind the 
proof of the previous lemma was to move the given configuration of circles to a 
standard one. In so doing, we simplify: we start with a configuration that might be 
difficult to analyze and work with, but then by using a linear fractional transformation, 
we can reduce to a particular case that is easy to think about instead. This fundamental 
insight is captured in the following philosophical statement, which we will be using 
implicitly throughout this chapter. 


Philosophical Principle 


Given a difficult geometry problem involving circles, lines, and angles (but not 
necessarily distances), see if you can use a Möbius transformation to transform a
complicated configuration into a simple one where the answer is easy to see. 
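
As a small illustration of this principle, here is a Python sketch of the normalization used in the proof of Lemma 3.1: it builds a linear fractional transformation sending three chosen points to $\infty$, $0$, $1$ and checks that other points of the circle through them land on the real line. The function name `to_standard_position` and the sample points are hypothetical choices made for this example; the explicit formula is the standard three-point normalization rather than something quoted from the text.

```python
import cmath


def to_standard_position(p, p1, p2):
    """Return phi with phi(p) = infinity, phi(p1) = 0, phi(p2) = 1.

    p, p1, p2 are assumed to be three distinct finite complex numbers.
    """
    def phi(z):
        # Undefined (division by zero) at z = p, which is sent to infinity.
        return ((z - p1) * (p2 - p)) / ((z - p) * (p2 - p1))
    return phi


# Hypothetical example: three points on the unit circle.
phi = to_standard_position(1, 1j, -1)

# Other points of the unit circle should be sent to (numerically) real numbers.
for t in (0.3, 2.0, 4.5):
    w = phi(cmath.exp(1j * t))
    print(w.real, abs(w.imag) < 1e-9)
```

After this normalization the circle has become the real line, and a tangency at the point sent to $\infty$ turns into parallelism of lines.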


Lemma 3.2 Let $C_1, C_2$ be two oriented circles which are not tangent and whose interiors do not intersect. Let $C_3$ be an oriented circle tangent to both of them, but such that its interior does not intersect theirs. There exist exactly two oriented circles $C_4, C_5$ that are tangent to $C_1, C_2, C_3$ with an orientation such that none of the interiors of $C_1, C_2, C_3, C_4, C_5$ intersect.


Proof Let $p$ be the point at which $C_3$ is tangent to $C_1$. Using a linear fractional transformation, we can move $p$ to $\infty$. By applying rotations, translations, and dilations, we may assume that $C_1$ is the real line whose interior is below the real axis; $C_2$ is a circle above the real axis with a counter-clockwise orientation with center $(0, y_0)$ and radius 1; and $C_3$ is a line $y = y_0 + 1$ tangent to $C_2$ whose interior lies above it. It is clear that any circle with center $(x_0, (y_0 + 1)/2)$ and radius $(y_0 + 1)/2$ will be tangent to both $C_1$ and $C_3$; by changing $x_0$, we can arrange for this circle to be tangent to $C_2$ in one of three ways, two of which are illustrated in Figure 3.2. (The third way would have the interior of this circle intersect the interior of at least one of $C_1, C_2, C_3$.) This assertion can be proved formally, although at present we lack the tools to prove it elegantly; this will be rectified later in this chapter. However, it can still be done by solving the set of equations

\[ (x - x_0)^2 + \left(y - \frac{y_0 + 1}{2}\right)^2 = \left(\frac{y_0 + 1}{2}\right)^2 \]
\[ x^2 + (y - y_0)^2 = 1 \]

where the solutions are points $(x, y)$ that lie both on $C_2$ and our new circle. This is something of an algebraic mess, but it resolves to

\[ x = \frac{-2x_0^2(y_0 - 1) \pm (y_0 - 1)\sqrt{x_0^4(-x_0^2 + 2y_0 + 2)} + 2x_0^4}{x_0\left(4x_0^2 + (y_0 - 1)^2\right)} \]
\[ y = \frac{x_0^2(3y_0 + 1) \pm 2\sqrt{x_0^4(-x_0^2 + 2y_0 + 2)} + (y_0 - 1)^2(y_0 + 1)}{4x_0^2 + (y_0 - 1)^2}. \]

The intersection point should be unique, which means that we need $x_0^4(-x_0^2 + 2y_0 + 2) = 0$. If $x_0 = 0$, then the interior of our new circle intersects either $C_2$ or $C_1$ and $C_3$. The other possibility is that $x_0 = \pm\sqrt{2(y_0 + 1)}$, whence our two solutions. Giving our new solutions the counter-clockwise orientation, we are done. □
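
The conclusion of this proof is easy to sanity-check numerically. The sketch below, written under the normalization used in the proof ($C_1$ is the line $y = 0$, $C_2$ has center $(0, y_0)$ and radius 1, $C_3$ is the line $y = y_0 + 1$), verifies that the two circles with $x_0 = \pm\sqrt{2(y_0 + 1)}$ are tangent to all three. The function names and the sample values of $y_0$ are illustrative assumptions.

```python
import math


def tangent_circle_centers(y0):
    """Centers (x0, (y0 + 1)/2) of the two circles from the proof, for y0 > 1."""
    x0 = math.sqrt(2 * (y0 + 1))
    height = (y0 + 1) / 2
    return [(-x0, height), (x0, height)]


def check(y0, tol=1e-9):
    """Return one boolean per circle: is it tangent to C1, C2, and C3?"""
    radius = (y0 + 1) / 2
    results = []
    for cx, cy in tangent_circle_centers(y0):
        # Tangent to the line y = 0: the center's height equals the radius.
        touches_c1 = abs(cy - radius) < tol
        # Tangent to the line y = y0 + 1: distance to that line equals the radius.
        touches_c3 = abs((y0 + 1 - cy) - radius) < tol
        # Externally tangent to C2: distance between centers is the sum of radii.
        touches_c2 = abs(math.hypot(cx, cy - y0) - (radius + 1)) < tol
        results.append(touches_c1 and touches_c2 and touches_c3)
    return results


print(check(2.0))
print(check(5.5))
```

This only confirms tangency for particular values of $y_0$; the uniqueness claim still rests on the discriminant condition derived in the proof.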


Thus, we conclude that our intuitive definition of Steiner chains is perfectly valid regardless of the circles $C_1, C_2$ that we choose. It remains to attack Steiner's porism itself. The methods we use are essentially the same as for the lemma.


Theorem 3.1 (Steiner's Porism) Let $C_1, C_2$ be two oriented circles that are not tangent and whose interiors do not intersect. All Steiner chains starting with these two circles have the same number of circles and are either all open or all closed.


Proof The key observation is that if we apply an inversive transformation to a Steiner chain, then the result will also be a Steiner chain. Furthermore, this new Steiner chain will have the same number of circles as the original and will be closed if and only if the original was closed. Thus, we can use inversive geometry to reduce the general case to a simple case that is easy to work with. The visual picture for this is given in Figure 3.3. First, we note that we can always assume that both $C_1$ and $C_2$ are circles: if not, simply choose a point $z_0$ that does not lie on either $C_1$ or $C_2$, and apply the transformation $z \mapsto 1/(z - z_0)$. Second, we can assume that $C_1$ is contained inside $C_2$: if not, simply take a circle inversion through $C_2$. This reduces to the classical setting in which Steiner's porism is usually presented. By applying a sequence of translations, rotations, and dilations, we can assume that $C_2$ is the unit circle and that the center of $C_1$ lies on the real axis. The final step is to apply a linear fractional transformation that fixes $C_2$ but moves $C_1$ to a circle that is concentric with $C_2$; the only difficulty is proving that such a transformation exists.

Fig. 3.3 A visual sketch of the proof of Steiner's porism. (a) shows the starting configuration of two circles; (b) shows the image of this configuration, put into standard position; (c) shows a completed Steiner chain in this standard position; (d) shows how this Steiner chain lifts back to the original configuration.

There are a number of different ways to prove this; we will take a hybrid geometric/analytic approach. First, note that the real line passes through the centers of both $C_1$ and $C_2$, and therefore is perpendicular to both of them. Find any other generalized circle $C_3$ that is perpendicular to both $C_1$ and $C_2$; the most convenient choice is to have $C_3$ perpendicular to the real line as well. Here is one way to show that such a circle exists: for any real number $r$, there exists a unique generalized circle perpendicular to both the real line and $C_2$ that passes through $r$. (See Exercise 3.2.1.) If we take $r$ to be inside $C_1$, then this circle intersects $C_1$ at some angle $\theta$ which ranges from $0$ to $\pi$. Since this function $r \mapsto \theta$ is continuous, by the intermediate value theorem, there is some $r$ where $\theta = \pi/2$ exactly; that is, the circle we have chosen is perpendicular to $C_1$. Let $p$ be the point where the real line intersects with $C_3$ and let $\varphi \in$ Möb°(2) be such that $\varphi(-1) = -1$, $\varphi(1) = 1$, and $\varphi(p) = 0$. The image of the real line under $\varphi$ is itself. The image of $C_2$ is a circle that is perpendicular to the real line at $-1$ and $1$; however, there is only one such circle, and that is $C_2$, the unit circle. The image of $C_3$ is a circle that is perpendicular to the real line and $C_2$ and passes through $0$; there is only one such circle, and that is the vertical line $x = 0$. Thus, the image of $C_1$ is a circle that is perpendicular to $y = 0$ and $x = 0$; this is true if and only if its center is $0$. (See Exercise 3.2.2.) Thus, we have shown that Steiner's porism is true in general as long as it is true of concentric circles. However, for concentric circles, it is obvious that Steiner's porism is true: first of all, we can use a rotation to move the starting point for the chain onto the positive real axis; secondly, we can use a reflection if necessary to switch between the two possible choices of circle in the second step of the construction. Therefore, all Steiner chains have the same number of circles, and they are either all closed or all open. □
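
The last step of the proof can be made concrete. The Python sketch below assumes that $C_2$ is the unit circle and that $C_1$ meets the real axis at $a < b$ with $-1 < a < b < 1$. It uses the map $\varphi(z) = (z - p)/(1 - pz)$ with real $p$, a standard transformation that fixes $\pm 1$, preserves the unit circle, and sends $p$ to $0$; this explicit formula is an assumption of the sketch and not the construction given in the text. The value of $p$ is found by requiring $\varphi(a) = -\varphi(b)$. The closure test relies on a standard fact about concentric circles of radii $\rho < 1$ and $1$ that is not stated in the text: adjacent chain circles subtend an angle $\alpha$ at $0$ with $\sin(\alpha/2) = (1 - \rho)/(1 + \rho)$, so the chain closes exactly when $2\pi/\alpha$ is an integer. All names and sample values are hypothetical.

```python
import math


def concentric_parameter(a, b):
    """Real p in (-1, 1) with phi(a) = -phi(b), for -1 < a < b < 1."""
    if abs(a + b) < 1e-15:
        return 0.0  # C1 is already centered at 0
    # Smaller root of (a + b) p^2 - 2 (1 + a b) p + (a + b) = 0.
    disc = (1 - a * a) * (1 - b * b)
    return ((1 + a * b) - math.sqrt(disc)) / (a + b)


def phi(z, p):
    """Map fixing -1 and 1, preserving the unit circle, and sending p to 0."""
    return (z - p) / (1 - p * z)


def chain_turns(rho):
    """Return 2*pi/alpha for concentric circles of radii rho < 1 and 1."""
    half_angle = math.asin((1 - rho) / (1 + rho))
    return math.pi / half_angle


# Hypothetical example: C1 has center 0.2 and radius 0.3.
a, b = 0.2 - 0.3, 0.2 + 0.3
p = concentric_parameter(a, b)
rho = abs(phi(b, p))  # radius of the image of C1, now centered at 0
turns = chain_turns(rho)
is_closed = abs(turns - round(turns)) < 1e-9
print(p, rho, turns, is_closed)
```

Because the answer depends only on $\rho$, it is independent of the starting point of the chain, which is the content of the porism.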
Fig. 3.4 For any three mutually tangent circles in these drawings, you can see that there are precisely two circles tangent to all three of them.


3.2 Apollonian Gaskets 
In the third century BCE, Apollonius of Perga asked the following question. 


> Question Consider three circles in the Euclidean plane. How many circles exist 
that are tangent to each of these three circles simultaneously? How might one find 
such circles? 


From the writings of other Greeks, we know that Apollonius gave a solution to 
this problem; sadly, the details of how he did it are lost to history. On the other hand, 
we might wonder how we might attack this using inversive geometry by reducing the 
general problem to a simpler one using linear fractional transformations to move the 
circles into a standard configuration. However, there is still some splitting into cases 
that must happen, because it matters if the three initial circles intersect or not. We 
shall consider a special case of Apollonius’ problem which has been of the greatest 
interest to modern mathematicians. 


> Question Consider three circles in the Euclidean plane such that any two circles 
are tangent to one another. How many circles exist that are tangent to each of these 
three circles simultaneously? 


We will see that the answer to this question is quite simple: there are always 
exactly two such (generalized) circles. The proof of this is not at all complicated and 
comes from the following slight generalization. 


Theorem 3.2 Let $C_1, C_2, C_3$ be three distinct, mutually tangent generalized circles in $\mathbb{C}P^1$. Then there exist exactly two generalized circles $C_4, C_4'$ that are tangent to each of $C_1, C_2, C_3$.


Remark 3.1 Some concrete examples of this result are shown in Figure 3.4. 
Proof Let $z_1, z_2, z_3$ be the points where $C_1, C_2, C_3$ are tangent. Use a linear fractional transformation to send $z_1 \mapsto \infty$, $z_2 \mapsto 0$. What does our configuration of generalized circles now look like? Well, two of them pass through $\infty$, so they must be lines; since they are tangent at $\infty$, they must be parallel lines. Furthermore, one of them passes through $0$. Using a rotation and dilation if necessary, we can move this configuration further so that one circle is the real line and another is $y = 1$. The final generalized circle is a circle that is tangent to $y = 0$ at $0$ and is tangent to $y = 1$. What is this circle? Well, since it is tangent to $y = 0$ at $0$, it must be of the form

\[ x^2 + (y - y_0)^2 = y_0^2 \]

for some $y_0 \in \mathbb{R}$. There is only one such circle that is tangent to $y = 1$, and that is $x^2 + (y - 1/2)^2 = 1/4$. Note as well that, more generally, any circle that is tangent to both $y = 0$ and $y = 1$ must be a translation of this circle; this is because we can always translate the point of intersection at $y = 0$ to $0$ if need be. Therefore, any circle that is tangent to the images of $C_1, C_2$ must be of the form

\[ (x - x_0)^2 + (y - 1/2)^2 = 1/4. \]

There are exactly two such circles that are tangent to the image of $C_3$, and those occur if $x_0 = -1$ or $x_0 = 1$. However, since we moved $C_1, C_2, C_3$ by linear fractional transformations, the number of tangent circles could not have changed. □


There is a useful corollary of this result that applies to Descartes configurations. 


Definition 3.1 We say that four generalized circles are a Descartes configuration if 
any three of them are mutually tangent to one another. 


Corollary 3.1 (Existence and Uniqueness of Descartes Swaps) For any Descartes configuration $C_1, C_2, C_3, C_4$, there is a unique circle $C_4' \neq C_4$ such that $C_1, C_2, C_3, C_4'$ is also a Descartes configuration.


Proof Note that, by definition, $C_1, C_2, C_3$ are three mutually tangent circles and so, by Theorem 3.2, we know that there exist exactly two circles that are tangent to each of $C_1, C_2, C_3$. One of them must be $C_4$; the other one is $C_4'$. It is easy to see from the definition that $C_1, C_2, C_3, C_4'$ is a Descartes configuration. □


Exchanging $C_4$ for $C_4'$ is often called a Descartes swap. For any Descartes configuration, there are exactly four corresponding Descartes swaps, coming from the four different choices of circle that we can swap out. One might well ask whether there is some kind of geometric interpretation that we can give to these swaps, and indeed there is.


Theorem 3.3 (Geometric Interpretation of Descartes Swaps) For any Descartes configuration $C_1, C_2, C_3, C_4$, the Descartes swap

\[ (C_1, C_2, C_3, C_4) \mapsto (C_1, C_2, C_3, C_4') \]

is given by an inversion through the unique circle that passes through the intersection points of $C_1, C_2, C_3$. Furthermore, this aforementioned circle is orthogonal to $C_1, C_2, C_3$.

Fig. 3.5 Two examples of Descartes swaps.


Proof The easiest way to prove this is to reduce to the standard Descartes configuration that we had previously, where $C_1, C_2$ are the lines $y = 0$ and $y = 1$, $C_3$ is the circle with center $i/2$ and radius $1/2$, and $C_4$ is the circle with center $1 + i/2$ and radius $1/2$. Then $C_4'$ is the circle with center $-1 + i/2$ and radius $1/2$; we see that this is exactly the image of $C_4$ under the reflection through the line $x = 0$, as illustrated in Figure 3.5. This line indeed passes through the intersection points of $C_1, C_2, C_3$, and is orthogonal to those three circles. Finally, this reflection does not move $C_1, C_2, C_3$, so indeed it moves the Descartes configuration $C_1, C_2, C_3, C_4$ to the Descartes configuration $C_1, C_2, C_3, C_4'$ as desired. □
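
To see the theorem at work outside the standard configuration, here is a Python sketch for a hypothetical example: three mutually tangent unit circles whose centers form an equilateral triangle. The circle through the three tangency points is inverted through, and the small circle nestled between the three is carried to the large circle surrounding them, while each of the three unit circles is mapped to itself. The inversion formula for circles is the standard one; the helper name `invert_circle` and the example are assumptions of this sketch, and orientations are ignored.

```python
import math


def invert_circle(center, radius, o, k):
    """Image of the circle |z - center| = radius under inversion in |z - o| = k.

    Assumes the circle does not pass through o (otherwise the image is a line).
    Returns (new_center, new_radius).
    """
    d = center - o
    power = abs(d) ** 2 - radius ** 2
    if abs(power) < 1e-15:
        raise ValueError("circle passes through the center of inversion")
    return o + (k ** 2) * d / power, (k ** 2) * radius / abs(power)


# Three unit circles centered at the vertices of an equilateral triangle
# with side 2 and centroid 0.
circumradius = 2 / math.sqrt(3)
angles = [math.pi / 2 + j * 2 * math.pi / 3 for j in range(3)]
centers = [circumradius * complex(math.cos(t), math.sin(t)) for t in angles]

# Circle through the three tangency points (the midpoints of the sides).
dual_center, dual_radius = 0j, 1 / math.sqrt(3)

# The two circles tangent to all three unit circles.
inner = (0j, circumradius - 1)
outer = (0j, circumradius + 1)

fixed = [invert_circle(c, 1.0, dual_center, dual_radius) for c in centers]
swapped = invert_circle(inner[0], inner[1], dual_center, dual_radius)
print(fixed)
print(swapped, outer)
```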


With this result in mind, we make a definition. 


Definition 3.2 For any Descartes configuration $C_1, C_2, C_3, C_4$, the circles that are orthogonal to triples of $C_1, C_2, C_3, C_4$ are known as the dual circles. The collection of all transformations in Möb(2) that can be written as a composition of reflections through these dual circles is known as the Apollonian group. (The reader should check for themselves that this is indeed a group; see Exercise 3.3.1.)


We will give a more algebraic description of the Apollonian group later in this chapter. For now, we content ourselves with exploring its geometric significance, which comes from Apollonian gaskets. Notice that since Descartes swaps move Descartes configurations to Descartes configurations, we could iterate the process, doing it over and over again. If we do this ad infinitum, the result is what is known as an Apollonian gasket, which is shown in Figure 3.6.

Fig. 3.6 The iterative construction of an Apollonian gasket. We start with a Descartes configuration in (a), drawing the dual circles in red. In (b), we add all Descartes swaps of the initial configuration through the dual circles. In (c), we add all Descartes swaps of the new circles, and in (d) we repeat this process again.


Definition 3.3 Let $C_1, C_2, C_3, C_4$ be a Descartes configuration. The Apollonian gasket with starting configuration $C_1, C_2, C_3, C_4$ is the smallest set $S$ of generalized circles in $\mathbb{C}P^1$ such that

1. $S$ contains $C_1, C_2, C_3, C_4$ and
2. if $C_1', C_2', C_3', C_4'$ are all circles in $S$ that form a Descartes configuration, then all of the Descartes swaps of these circles are also in $S$.


Remark 3.2 Such configurations are also sometimes called Leibniz packings as they 
were first described by the mathematician Gottfried Leibniz in a letter to de Brosses 
in the seventeenth century. 
Fig. 3.7 The standard Apollonian gasket.


Strictly speaking, this is defined in a way that is a little different than how we described it initially, but it is not hard to see that these descriptions are equivalent. (See Exercise 3.2.4.) However, our definition is easier to work with. We begin by noting that Apollonian gaskets are all essentially the same. Call the Descartes configuration with circles $y = 0$, $y = 1$, $x^2 + (y - 1/2)^2 = 1/4$, and $(x - 1)^2 + (y - 1/2)^2 = 1/4$ the standard configuration; the corresponding Apollonian gasket is illustrated in Figure 3.7.


Theorem 3.4 Let $C_1, C_2, C_3, C_4$ and $D_1, D_2, D_3, D_4$ be two Descartes configurations. Let $A_1, A_2$ be the corresponding Apollonian gaskets. If $\varphi \in$ Möb°(2) is such that $\varphi(C_i) = D_i$ for $i = 1, 2, 3, 4$, then $\varphi(A_1) = A_2$.


Proof It is not hard to see that linear fractional transformations preserve Descartes swaps since they preserve tangencies. Therefore, $\varphi(A_1)$ is a set of generalized circles that contains $D_1, D_2, D_3, D_4$ and such that for any quadruple in the set, all of their Descartes swaps are also in the set. The set $\varphi(A_1)$ must be the smallest set with this property: if it were not, then $\varphi^{-1}(A_2)$ would be a proper subset of $A_1$ closed under Descartes swaps and containing $C_1, C_2, C_3, C_4$. However, this would violate the definition of $A_1$ as the smallest set with that property. We conclude that $\varphi(A_1) = A_2$. □


Corollary 3.2 Let $A$ be an Apollonian gasket with initial configuration $C_1, C_2, C_3, C_4$. There exists $\varphi \in$ Möb°(2) such that $\varphi(A)$ is the Apollonian gasket of the standard configuration.


Proof We already know that any Descartes configuration can be reduced to the standard configuration by linear fractional transformations. The rest follows from Theorem 3.4. □


3.3 Inversive Coordinates 


Thus far, our computations of the images of generalized circles under linear fractional transformations have been inefficient: we have had to compute equations describing those circles, calculate how those equations transform, and then finally work out what the new corresponding generalized circle is. While this is certainly not impossible to do, there is a substantially faster way via inversive coordinates, which will aid us in our investigations into Steiner's porism and Apollonian gaskets.


Definition 3.4 For any oriented circle $C$, let $\kappa(C)$ be the bend of $C$; that is,

\[ \kappa(C) = \begin{cases} 1/R & \text{if $C$ is a circle with radius $R$, oriented counter-clockwise} \\ -1/R & \text{if $C$ is a circle with radius $R$, oriented clockwise} \\ 0 & \text{if $C$ is a line.} \end{cases} \]

The co-bend of $C$, denoted $\kappa'(C)$, is the bend of the image of $C$ under the map $z \mapsto -1/z$. The bend-center of $C$ is denoted by $\xi(C)$ and

\[ \xi(C) = \begin{cases} \kappa(C) z_0 & \text{if $C$ is a circle with center $z_0$} \\ i e^{i\theta} & \text{if $C$ is a line traversed in the direction $e^{i\theta}$.} \end{cases} \]

Together, $(\kappa(C), \kappa'(C), \xi(C))$ are the inversive coordinates for $C$.
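
The definition can be tried out numerically. The following Python sketch computes the inversive coordinates of a circle straight from the definition: the co-bend is obtained by pushing three points of the circle, listed in the direction of traversal, through $z \mapsto -1/z$ and measuring the bend of the oriented circle through their images. The three-point formula for the signed bend (twice the signed area divided by the product of the side lengths, which is the reciprocal of the circumradius with a sign) is a standard fact rather than something taken from the text, and the function names are hypothetical.

```python
import cmath
import math


def signed_bend(a, b, c):
    """Bend of the oriented generalized circle through a, b, c (in that order).

    Positive for counter-clockwise traversal, negative for clockwise,
    and 0 when the three points are collinear (a line).
    """
    u, v = b - a, c - a
    cross = u.real * v.imag - u.imag * v.real  # twice the signed area
    return 2 * cross / (abs(a - b) * abs(b - c) * abs(c - a))


def inversive_coordinates(z0, R, counterclockwise=True):
    """(bend, co-bend, bend-center) of the circle |z - z0| = R, with R > 0.

    Assumes none of the three sample points used below is 0.
    """
    sign = 1 if counterclockwise else -1
    # Three points on the circle, listed in the direction of traversal.
    samples = [z0 + R * cmath.exp(1j * sign * t)
               for t in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
    bend = signed_bend(*samples)
    images = [-1 / z for z in samples]  # the map z -> -1/z
    co_bend = signed_bend(*images)
    return bend, co_bend, bend * z0


print(inversive_coordinates(2 + 0j, 1.0))
print(inversive_coordinates(2 + 0j, 1.0, counterclockwise=False))
```

Reversing the orientation negates all three coordinates in this example.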


Remark 3.3 It is more common in the literature to refer to the bend, co-bend, and bend-center as the curvature, the co-curvature, and the curvature-center; see [17], for example. Furthermore, it is more typical to see the co-bend defined in terms of $1/z$ rather than $-1/z$. My reason for departing from these conventions is very simple: one can generalize all these definitions for higher dimensional spaces and study, for instance, oriented spheres in $\mathbb{R}^3 \cup \{\infty\}$. (I did exactly that in my thesis [14].) In that case, you want to define inversive coordinates exactly as I have here; e.g., the bend should be $\pm 1/R$ where $R$ is the radius. However, the usual definition of curvature for a sphere is $\pm 1/R^2$. Similarly, while $1/z$ is orientation-preserving as a transformation on the plane, as a transformation on $\mathbb{R}^3$, it is orientation-reversing! On the other hand, $-1/z$ is still a perfectly good orientation-preserving transformation.


The inversive coordinates of an oriented circle specify it uniquely. 


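Lemma 3.3 The map

$$\operatorname{inv}: \{\text{oriented circles in } \mathbb{CP}^1\} \to \mathbb{R}^4, \qquad C \mapsto \left(\kappa(C), \kappa'(C), \Re(\xi(C)), \Im(\xi(C))\right)$$

is injective and $\operatorname{inv}(-C) = -\operatorname{inv}(C)$ for all oriented circles $C$.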


Proof Suppose that $\operatorname{inv}(C_1) = \operatorname{inv}(C_2)$. Since $\kappa(C_1) = \kappa(C_2)$, either they are both
circles or both lines. If they are both circles, then they have the same radius, center,
and orientation since $\kappa(C_1) = \kappa(C_2)$ and $\xi(C_1) = \xi(C_2)$. If they are both lines, the
fact that $\xi(C_1) = \xi(C_2)$ tells us that they are parallel. Since they are parallel, we
can find a line through the origin that is perpendicular to both of them. Now, what
is the image under the transformation $z \mapsto -1/z$? Well, the line through the origin
will still be a line through the origin, which will still be orthogonal to the images
of $C_1$ and $C_2$, which will now both be circles passing through $0$ (indeed, since $C_1$
and $C_2$ were tangent at $\infty$, these circles must be tangent at $0$). Since both of these
circles are orthogonal to a line through the origin, their centers must lie on this line.
Since $\kappa'(C_1) = \kappa'(C_2)$, the radii of these circles must be the same, so they must
either coincide or be reflections across a line normal to both of them; furthermore,
they must have the same orientation. However, the second case is impossible due
to orientation considerations: under the transformation $z \mapsto -1/z$, the two circles
will map to lines pointing in opposite directions. Therefore, $C_1 = C_2$. Proving that
$\operatorname{inv}(-C) = -\operatorname{inv}(C)$ is left as an exercise to the reader. (See Exercise 3.2.5.) □


A curious fact is that the inversive coordinates of any oriented circle always lie 
on the surface of a hyperboloid. 


Theorem 3.5 Let $C$ be an oriented circle with inversive coordinates $(\kappa, \kappa', \xi)$. Then
$-\kappa\kappa' + |\xi|^2 = 1$.


Proof If $C$ is a line, then $\kappa = 0$ and $\xi$ is a unit vector, and so the claim follows
immediately. Otherwise, we note that since $\operatorname{inv}(-C) = -\operatorname{inv}(C)$, we may assume
without loss of generality that $C$ is a circle with positive orientation. In that case, we
know that if $C$ has center $z_0$ and radius $R$, then $\kappa = 1/R$, $\xi = z_0/R$. Note that if we
rotate $C$ around the origin by $\theta$ radians, then this won't change the bend or co-bend,
and will merely change $\xi$ to $e^{i\theta}\xi$; thus, this will not change the value of $-\kappa\kappa' + |\xi|^2$.
Thus, we may assume that $z_0 \geq 0$. If $z_0 = 0$, it is easy to check that $\kappa' = -R$, so it
remains to consider the case where $z_0 > 0$. We know that our circle together with
its interior will be the set of points satisfying $|z - z_0| \leq R$; its image under the
map $z \mapsto -z^{-1}$ will therefore be the set of points satisfying $|-z^{-1} - z_0| \leq R$.
Equivalently, this is the set of points $1 + 2z_0\Re(z) + |z|^2 z_0^2 \leq |z|^2 R^2$. If $z_0 = R$, then
this is the equation of a half-plane, hence $\kappa' = 0$. However, then $\xi = z_0/R = 1$, and
so $-\kappa\kappa' + |\xi|^2 = 1$. If $z_0 \neq R$, then instead we can complete the square, yielding

$$(z_0^2 - R^2)|z|^2 + 2z_0\Re(z) + \frac{z_0^2}{z_0^2 - R^2} + 1 \leq \frac{z_0^2}{z_0^2 - R^2}$$

$$(z_0^2 - R^2)\left|z + \frac{z_0}{z_0^2 - R^2}\right|^2 \leq \frac{R^2}{z_0^2 - R^2},$$

whence

$$\left|z + \frac{z_0}{z_0^2 - R^2}\right|^2 \leq \frac{R^2}{(z_0^2 - R^2)^2} \quad \text{if } z_0 > R, \qquad \left|z + \frac{z_0}{z_0^2 - R^2}\right|^2 \geq \frac{R^2}{(z_0^2 - R^2)^2} \quad \text{if } z_0 < R.$$

Therefore, $\kappa' = (z_0^2 - R^2)/R$ and consequently

$$-\kappa\kappa' + |\xi|^2 = -\frac{1}{R} \cdot \frac{z_0^2 - R^2}{R} + \frac{z_0^2}{R^2} = 1,$$

exactly as claimed. □
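As a quick numerical sanity check of Theorem 3.5, here is a minimal Python sketch. It assumes the closed form $\kappa' = (|z_0|^2 - R^2)/R$ for a positively oriented circle (the formula from the proof, extended to arbitrary centers by rotation) together with $\operatorname{inv}(-C) = -\operatorname{inv}(C)$; the function names are illustrative and do not come from the text.

```python
def inversive_coordinates(center, radius, counterclockwise=True):
    """Return (bend, co_bend, bend_center) for a circle that is not a line.

    Assumes the closed form co_bend = (|z0|^2 - R^2)/R for positive
    orientation, and negates everything for clockwise orientation.
    """
    sign = 1.0 if counterclockwise else -1.0
    bend = sign / radius
    co_bend = sign * (abs(center) ** 2 - radius ** 2) / radius
    bend_center = bend * center
    return bend, co_bend, bend_center


def hyperboloid_form(bend, co_bend, bend_center):
    """Evaluate -kappa * kappa' + |xi|^2, which should equal 1."""
    return -bend * co_bend + abs(bend_center) ** 2


if __name__ == "__main__":
    for center, radius, ccw in [(1 + 1j, 2.0, True), (3 - 4j, 5.0, False), (0j, 0.5, True)]:
        coords = inversive_coordinates(center, radius, ccw)
        print(coords, hyperboloid_form(*coords))
```

Each printed value of the quadratic form should agree with 1 up to floating-point rounding.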
There is another way to write down inversive coordinates that is a little more
useful for our purposes. Specifically, suppose that we have an oriented circle $C$ and
we write the matrix

$$M(C) = \begin{pmatrix} \kappa'(C) & \xi(C) \\ \overline{\xi(C)} & \kappa(C) \end{pmatrix}.$$

This captures the same information. Moreover, it is easy to see that $\det(M(C)) =
\kappa(C)\kappa'(C) - |\xi(C)|^2$. As a consequence, we have the following observation.


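Theorem 3.6 There exists a bijective map

$$\operatorname{inv}: \{\text{oriented circles in } \mathbb{CP}^1\} \to \left\{ M \in \operatorname{Mat}(2, \mathbb{C}) \;\middle|\; \overline{M}^T = M, \ \det(M) = -1 \right\}$$

$$C \mapsto \begin{pmatrix} \kappa'(C) & \xi(C) \\ \overline{\xi(C)} & \kappa(C) \end{pmatrix}$$

where $\operatorname{Mat}(2, \mathbb{C})$ denotes the set of all $2 \times 2$ matrices with complex coefficients and
$\overline{M}^T$ denotes the conjugate transpose, i.e.

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \implies \overline{M}^T = \begin{pmatrix} \overline{a} & \overline{c} \\ \overline{b} & \overline{d} \end{pmatrix}.$$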


Remark 3.4 It is technically an abuse of notation to call this map inv as well, but 
I think it is acceptable since it will always be clear from context whether we are 
thinking of inversive coordinates as vectors or matrices. 


Proof Notice that $M = \overline{M}^T$ if and only if

$$M = \begin{pmatrix} a & x + iy \\ x - iy & b \end{pmatrix}$$

for some $a, b, x, y \in \mathbb{R}$; by this and Theorem 3.5, we see that the defined map is
well defined. That it is injective follows immediately from Lemma 3.3. It remains to
prove surjectivity, which isn't too hard. Consider a matrix

$$M = \begin{pmatrix} a & x + iy \\ x - iy & b \end{pmatrix}$$

with $\det(M) = -1$. If $b = 0$, then we know that $|x + iy| = 1$, so $x + iy = e^{i\theta}$ for some $\theta \in \mathbb{R}$. Let $C$ be the
line traversed in the direction $-ie^{i\theta}$ and passing through the point $ae^{i\theta}/2$. Then one
can check that $\operatorname{inv}(C) = M$. If $b \neq 0$, then there exists a circle $C$ with bend $b$ and
center $(x + iy)/b$. The co-bend $\kappa'$ of this circle must satisfy $-b\kappa' + |x + iy|^2 = 1$,
which means that $\kappa' = a$. Thus, $\operatorname{inv}(C) = M$, and we are done. □
> Example Let $C$ be a line with inversive coordinates $(0, \kappa', \xi)$. Find a parametric
equation for $C$ in terms of $\kappa'$ and $\xi$.

We know that $C$ is traversed in the direction $-i\xi$ by the definition of the bend-center,
so it suffices to find a single point on $C$. Since we know that $\xi$ points in the direction
of the interior of $C$, we know that $s\xi \in C$ for some real $s$. The image of $C$ under
$z \mapsto -1/z$ has to contain both $0$ and $-1/(s\xi) = -\overline{\xi}/s$, as these are the images of $\infty$
and $s\xi$. The line $L$ through the origin in the direction $\xi$ passes through both $\infty$ and $s\xi$
and is orthogonal to $C$ at the intersection point $s\xi$; its image under $z \mapsto -1/z$
is also a line, orthogonal to the image of $C$ at $0$ and $-\overline{\xi}/s$. This is easiest to see with a diagram.


We conclude that if $s \neq 0$, then the image of $C$ is a circle and $0$ and $-\overline{\xi}/s$ are
diametrically opposed. Therefore, the radius of the image of $C$ is $|\overline{\xi}/s|/2 = 1/(2|s|)$. Reasoning
out where the interior must be, we see that actually $\kappa' = 2s$; in fact, this still
holds true even if $\kappa' = 0$. We conclude that the point on $C$ is $s\xi = \kappa'\xi/2$ and so
$z = \kappa'\xi/2 - it\xi$ is a parametric equation for $C$, where $t \in \mathbb{R}$.


> Example Suppose that $C$ is a circle with positive orientation and tangent to the
real line at the origin. Determine the possible inversive coordinates of $C$.

If $C$ is tangent to the real line at the origin, then its center must be of the form $ti$
for some real $t \neq 0$, and its radius must be $|t|$. Therefore, the bend is $1/|t|$ and the
bend-center is $\operatorname{sgn}(t)i$, where $\operatorname{sgn}(t)$ is the sign of $t$; that is, it is $1$ if $t > 0$ and $-1$ if
$t < 0$. It remains to determine the co-bend $\kappa'$. However, we know that

$$1 = -\kappa\kappa' + |\xi|^2 = -\kappa'/|t| + 1,$$

so $\kappa' = 0$. We could also have seen this geometrically: since $C$ passes through $0$, its
image under $z \mapsto -1/z$ is a line. Either way, the inversive coordinates are
$(1/|t|, 0, \operatorname{sgn}(t)i)$; that is, they are of the form $(\kappa, 0, \pm i)$ for some $\kappa > 0$.
3.4 The Special Linear Group


Before we proceed to show how to use inversive coordinates for fast computations, 
we need to introduce another player, which might initially seem unrelated, but will 
eventually be very useful. 


Definition 3.5 The special linear group on $\mathbb{C}^2$, denoted by $SL(2, \mathbb{C})$, consists of
all $2 \times 2$ matrices with complex coefficients with determinant $1$. That is, if

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{C}),$$

then $\det M = ad - bc = 1$.


There are a few things we will need to know about the special linear group in 
order to proceed. To start with, it is actually a group. 


Theorem 3.7 The set SL(2, C) is a group if we take matrix multiplication to be the 
operation. 


Proof In the course of the proof of Theorem 2.3, we showed that matrix multiplication
is associative, that the identity matrix satisfies the properties of an identity,
and that $\det(M_1M_2) = \det(M_1)\det(M_2)$ for any matrices $M_1, M_2$. Therefore, if
$M_1, M_2 \in SL(2, \mathbb{C})$, then $\det(M_1M_2) = \det(M_1)\det(M_2) = 1$, which is to say that
$M_1M_2 \in SL(2, \mathbb{C})$. The only thing that remains is to check that $SL(2, \mathbb{C})$ contains
inverses. Indeed, if

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{C}),$$

then

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \in SL(2, \mathbb{C})$$

since the determinant is $da - (-b)(-c) = ad - bc = 1$. □


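Theorem 3.8 Every $\varphi \in \text{Möb}^0(2)$ can be written in the form

$$\varphi(z) = \frac{az + b}{cz + d}$$

for some $a, b, c, d \in \mathbb{C}$ such that

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{C}).$$


Proof We know that any linear fractional transformation can be written in the form

$$\varphi(z) = \frac{\tilde{a}z + \tilde{b}}{\tilde{c}z + \tilde{d}}$$

such that

$$\tilde{M} = \begin{pmatrix} \tilde{a} & \tilde{b} \\ \tilde{c} & \tilde{d} \end{pmatrix} \in GL(2, \mathbb{C}).$$

However, since $\det(\tilde{M}) \neq 0$, there exists some $\lambda \in \mathbb{C}^\times$ such that $\lambda^2 = \det(\tilde{M})$.
Define $a = \tilde{a}/\lambda$, $b = \tilde{b}/\lambda$, $c = \tilde{c}/\lambda$, $d = \tilde{d}/\lambda$. Then

$$\varphi(z) = \frac{\tilde{a}z + \tilde{b}}{\tilde{c}z + \tilde{d}} = \frac{az + b}{cz + d},$$

but on the other hand if

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \tilde{a} & \tilde{b} \\ \tilde{c} & \tilde{d} \end{pmatrix} \begin{pmatrix} 1/\lambda & 0 \\ 0 & 1/\lambda \end{pmatrix}$$

then $\det(M) = \det(\tilde{M})/\lambda^2 = 1$, so $M \in SL(2, \mathbb{C})$. □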
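The renormalization in this proof is easy to carry out numerically. The following Python sketch divides the entries of an invertible matrix by a square root of its determinant; the helper names `normalize_to_sl2` and `mobius` are illustrative choices, not notation from the text.

```python
import cmath


def normalize_to_sl2(a, b, c, d):
    """Rescale an invertible 2x2 complex matrix so that its determinant is 1.

    Either square root of the determinant works; the other choice
    simply yields the negative of the returned matrix.
    """
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    lam = cmath.sqrt(det)
    return a / lam, b / lam, c / lam, d / lam


def mobius(a, b, c, d, z):
    """Evaluate the linear fractional transformation z -> (az + b)/(cz + d)."""
    return (a * z + b) / (c * z + d)


if __name__ == "__main__":
    original = (1, 2, 3, 4)            # determinant -2
    normalized = normalize_to_sl2(*original)
    z = 0.3 + 0.7j
    print(mobius(*original, z), mobius(*normalized, z))  # same value twice
```

Both matrices define the same transformation, since a common scalar factor cancels in the quotient.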


There are many cases where it is more convenient to associate a matrix in $SL(2, \mathbb{C})$
to an element $\varphi \in \text{Möb}^0(2)$ rather than the more general case of a matrix in $GL(2, \mathbb{C})$, even
if it requires a little extra work to renormalize the determinant to 1. One reason why
this is nice is that there are infinitely many matrices in $GL(2, \mathbb{C})$ that correspond to any
single $\varphi \in \text{Möb}^0(2)$, but there are only two matrices in $SL(2, \mathbb{C})$ that correspond to
any given $\varphi$.


Theorem 3.9 Suppose $M_1, M_2 \in SL(2, \mathbb{C})$ are both matrices that correspond to
the same linear fractional transformation under the map $\Psi$ defined in Theorem 2.4.
Then $M_1 = \pm M_2$.


Proof Consider the matrix $M = M_1M_2^{-1} \in SL(2, \mathbb{C})$ and write this as

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

Since $\Psi(M_1) = \Psi(M_2)$, we know that $\Psi(M_1M_2^{-1}) = \Psi(M_1)\Psi(M_2)^{-1} = \mathrm{id}$, the
identity function. Therefore, we know that

$$\frac{az + b}{cz + d} = z$$

for all $z \in \mathbb{CP}^1$. This immediately implies that $c = 0$; this is because $\mathrm{id}(\infty) = \infty$
but $\Psi(M)(\infty) = a/c$ if $c \neq 0$. It also means that $b = 0$ since $\mathrm{id}(0) = 0$ and
$\Psi(M)(0) = b/d$. Ergo,

$$M = \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} \in SL(2, \mathbb{C}),$$

but since $\det(M) = ad = 1$, we have $d = 1/a$. This means that
$\Psi(M)(z) = a^2z$, and this is the same as $\mathrm{id}$ if and only if $a = \pm 1$. Thus, $M = \pm I$,
where $I$ is the identity matrix. But this means that $M_1M_2^{-1} = \pm I$, or $M_1 = \pm M_2$. □

Fig. 3.8 The illustration on the right is the image of the circles on the left under the action of
an element of $SL(2, \mathbb{C})$.


Another benefit of $SL(2, \mathbb{C})$ is that there is a simple way that it moves around
inversive coordinates. Specifically, choose any $\gamma \in SL(2, \mathbb{C})$ and any matrix
$M \in \operatorname{Mat}(2, \mathbb{C})$ such that $\det(M) = -1$ and $\overline{M}^T = M$. If we consider $N = \gamma M \overline{\gamma}^T$, then
we notice that

1. $\det(N) = \det(\gamma)\det(M)\det(\overline{\gamma}^T) = -1$ and
2. $\overline{N}^T = \overline{\gamma M \overline{\gamma}^T}^T = \gamma \overline{M}^T \overline{\gamma}^T = N$,

where we have used the fact that the conjugate transpose of a matrix of determinant 1
again has determinant 1 and that it reverses multiplication. (See Exercise 3.2.6.) This justifies the
following definition.


Definition 3.6 Let $C$ be an oriented circle and $\gamma \in SL(2, \mathbb{C})$. By $\gamma.C$, we shall
denote the unique oriented circle $C'$ such that $\operatorname{inv}(C') = \gamma \operatorname{inv}(C) \overline{\gamma}^T$.


This is a particular example of something known as a group action—a way that 
we can use a group to move around elements of some sets. To help illustrate what 
is going on, Figure 3.8 shows the effect of this particular action on a collection of 
lines, and how they get mapped to circles. 

As most groups we have dealt with have been groups of transformations, group 
actions are not exactly new to us. (Although a formal definition and further examples 
are relegated to the exercises—see Exercise 3.3.4.) However, this is the first example 
that we have seen where there are ostensibly two different group actions on the same 
space: on the one hand, we know that elements of $SL(2, \mathbb{C})$ correspond to Möbius
transformations which we know move around oriented circles; on the other hand,
we just came up with this new action by which we can move around oriented circles
by thinking about their inversive coordinates instead. As it happens, both of these 
actions are secretly one and the same. 


Theorem 3.10 (Equivalence of Matrix and Linear Fractional Actions)
Let $C$ be an oriented circle and let $\gamma \in SL(2, \mathbb{C})$. Then $\gamma.C = \Psi(\gamma)(C)$, where $\Psi$
is the usual map from $GL(2, \mathbb{C})$ to $\text{Möb}^0(2)$.


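Proof To simplify the proof, we first note that every element in $SL(2, \mathbb{C})$ can
be written as a product of matrices of the following types:

$$\begin{pmatrix} re^{i\theta} & 0 \\ 0 & \frac{1}{r}e^{-i\theta} \end{pmatrix}, \qquad \begin{pmatrix} 1 & \tau \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

A proof of this can be easily adapted from the proof of Theorem 2.5, and is left as an
exercise to the reader. (See Exercise 3.2.7.) Notice that if we can prove the theorem
for these basic types of matrices, then we will in fact have proved it for all elements
of $SL(2, \mathbb{C})$. Why is this? Well, it is easy to check that $\gamma_1.(\gamma_2.C) = (\gamma_1\gamma_2).C$. (See
Exercise 3.3.7.) We already know that $\Psi(\gamma_1\gamma_2) = \Psi(\gamma_1) \circ \Psi(\gamma_2)$. Therefore, if
we can write $\gamma = \gamma_1\gamma_2 \ldots \gamma_n$ for some $\gamma_i \in SL(2, \mathbb{C})$ for which we know that
$\gamma_i.C = \Psi(\gamma_i)(C)$, then it follows that

$$\begin{aligned} \gamma.C &= \gamma_1.(\gamma_2.(\ldots \gamma_n.C) \ldots) \\ &= \gamma_1.(\gamma_2.(\ldots \Psi(\gamma_n)(C)) \ldots) \\ &= (\Psi(\gamma_1) \circ \ldots \circ \Psi(\gamma_n))(C) \\ &= \Psi(\gamma_1\gamma_2 \ldots \gamma_n)(C) = \Psi(\gamma)(C). \end{aligned}$$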


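Now, let's look at what each of these basic matrices do, in turn. Let $C$ be an oriented
circle with inversive coordinates $(\kappa, \kappa', \xi)$. First,

$$\begin{pmatrix} re^{i\theta} & 0 \\ 0 & \frac{1}{r}e^{-i\theta} \end{pmatrix} \begin{pmatrix} \kappa' & \xi \\ \overline{\xi} & \kappa \end{pmatrix} \begin{pmatrix} re^{-i\theta} & 0 \\ 0 & \frac{1}{r}e^{i\theta} \end{pmatrix} = \begin{pmatrix} re^{i\theta}\kappa' & re^{i\theta}\xi \\ \frac{1}{r}e^{-i\theta}\overline{\xi} & \frac{1}{r}e^{-i\theta}\kappa \end{pmatrix} \begin{pmatrix} re^{-i\theta} & 0 \\ 0 & \frac{1}{r}e^{i\theta} \end{pmatrix} = \begin{pmatrix} r^2\kappa' & e^{2i\theta}\xi \\ e^{-2i\theta}\overline{\xi} & \kappa/r^2 \end{pmatrix},$$

so if

$$\gamma = \begin{pmatrix} re^{i\theta} & 0 \\ 0 & \frac{1}{r}e^{-i\theta} \end{pmatrix},$$

then $\operatorname{inv}(\gamma.C) = (\kappa/r^2, r^2\kappa', e^{2i\theta}\xi)$. On the other hand, $\Psi(\gamma)$ is the transformation
$z \mapsto r^2e^{2i\theta}z$. Well, if $C$ is a circle with bend $\kappa$ and center $\xi/\kappa$, then its image under
$\Psi(\gamma)$ will be a circle with bend $\kappa/r^2$ and center $r^2e^{2i\theta}\xi/\kappa$. If $C$ is a line, then its image
will be rotated by $e^{2i\theta}$ and so will its bend-center; the co-bend will be scaled by $r^2$.
Therefore, $\gamma.C = \Psi(\gamma)(C)$ in this case. Next,

$$\begin{pmatrix} 1 & \tau \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \kappa' & \xi \\ \overline{\xi} & \kappa \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \overline{\tau} & 1 \end{pmatrix} = \begin{pmatrix} \kappa' + \tau\overline{\xi} & \xi + \tau\kappa \\ \overline{\xi} & \kappa \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \overline{\tau} & 1 \end{pmatrix} = \begin{pmatrix} \kappa' + 2\Re(\tau\overline{\xi}) + |\tau|^2\kappa & \xi + \kappa\tau \\ \overline{\xi + \kappa\tau} & \kappa \end{pmatrix},$$

so if

$$\gamma = \begin{pmatrix} 1 & \tau \\ 0 & 1 \end{pmatrix},$$

then $\operatorname{inv}(\gamma.C) = (\kappa, \kappa' + 2\Re(\tau\overline{\xi}) + |\tau|^2\kappa, \xi + \kappa\tau)$. On the other hand, $\Psi(\gamma)$ is
the transformation $z \mapsto z + \tau$. If $C$ is a circle with bend $\kappa$ and center $\xi/\kappa$, then its
image will be a circle with bend $\kappa$ and center $\xi/\kappa + \tau = (\xi + \kappa\tau)/\kappa$. If $C$ is a line
in the direction $-i\xi$ and passing through the point $\kappa'\xi/2$, then its image will be a
line in the direction $-i\xi$ and passing through the point $\kappa'\xi/2 + \tau$. More relevantly,
this line will pass through the point $\kappa'\xi/2 + \Re(\tau\overline{\xi})\xi = (\kappa' + 2\Re(\tau\overline{\xi}))\xi/2$ since
$\Re(\tau\overline{\xi})\xi$ is the projection of $\tau$ onto the ray in the direction of $\xi$. In any case, we see
that $\gamma.C = \Psi(\gamma)(C)$. Finally,

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} \kappa' & \xi \\ \overline{\xi} & \kappa \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} \kappa & -\overline{\xi} \\ -\xi & \kappa' \end{pmatrix},$$

so if

$$\gamma = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$

then $\operatorname{inv}(\gamma.C) = (\kappa', \kappa, -\overline{\xi})$. On the other hand, $\Psi(\gamma)$ is the transformation
$z \mapsto -1/z$, which we know exchanges $\kappa$ and $\kappa'$ simply by the definition of the bend and
co-bend. Since $-\kappa\kappa' + |\xi|^2 = 1$, we only need to determine the direction of the
bend-center; it is not hard to see that if the bend-center of $C$ is $\xi$, then the image of
$C$ must have bend-center in the direction of $-\overline{\xi}$. Thus, $\gamma.C = \Psi(\gamma)(C)$, and we are
done. □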


This is fantastic news: it means that if we know the inversive coordinates of an
oriented circle, then it is easy to compute its image under any Möbius transformation.
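To make this concrete, here is a minimal Python sketch of the action from Definition 3.6. It assumes the matrix form of the inversive coordinates with the co-bend in the top-left entry and the bend in the bottom-right entry; the function names (`inv_matrix`, `act`) are illustrative and matrices are plain nested tuples.

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def conj_transpose(A):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(A[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def inv_matrix(bend, co_bend, bend_center):
    """Matrix form of the inversive coordinates (co-bend in the top-left)."""
    xi = complex(bend_center)
    return ((complex(co_bend), xi), (xi.conjugate(), complex(bend)))


def act(gamma, coords):
    """Inversive coordinates (bend, co_bend, bend_center) of gamma.C."""
    N = mat_mul(mat_mul(gamma, inv_matrix(*coords)), conj_transpose(gamma))
    return (N[1][1].real, N[0][0].real, N[0][1])


if __name__ == "__main__":
    unit_circle = (1.0, -1.0, 0j)        # center 0, radius 1, counter-clockwise
    translate_by_2 = ((1, 2), (0, 1))    # corresponds to z -> z + 2
    print(act(translate_by_2, unit_circle))
```

The printed coordinates describe a circle with bend 1 and bend-center 2, that is, the unit circle translated to have center 2, as expected for $z \mapsto z + 2$.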


> Example Prove that if $\gamma \in SL(2, \mathbb{C})$ has real coefficients and $C$ is the real line
oriented so that $i$ is in its interior, then $\gamma.C = C$.

It is easy to check that the inversive coordinates of $C$ are $(0, 0, i)$. Therefore, the
inversive coordinates of the image are given by

$$\gamma \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} \overline{\gamma}^T.$$

Writing

$$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

for some $a, b, c, d \in \mathbb{R}$ such that $ad - bc = 1$, we calculate directly that this is the
same as

$$\begin{pmatrix} 0 & (ad - bc)i \\ -(ad - bc)i & 0 \end{pmatrix} = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix},$$

which we note are just the inversive coordinates of $C$.



3.5 Inversive Distance 


We need one final computational tool before we can revisit the examples we looked 
at earlier, and that is the inversive distance of oriented circles. 


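Definition 3.7 Let $C_1, C_2$ be oriented circles with inversive coordinates $(\kappa_1, \kappa_1', \xi_1)$
and $(\kappa_2, \kappa_2', \xi_2)$. Their inversive distance is

$$(C_1, C_2)_I = \frac{\kappa_1\kappa_2' + \kappa_2\kappa_1'}{2} - \Re(\xi_1\overline{\xi_2}).$$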


Confusingly, inversive distance can be negative. Another term that sometimes
appears in the literature for this metric is “Pedoe product.” However, this seems to
be a case of Stigler’s law of eponymy¹, since the notion of inversive distance was
already discussed by Coxeter in 1966 [2] and is likely much older, whereas Daniel
Pedoe didn’t write about it until 1970.

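There are many equivalent ways to write the inversive distance. For example, if
we write $\xi_1 = x_1 + y_1i$ and $\xi_2 = x_2 + y_2i$, then we can write the inversive distance
as

$$(C_1, C_2)_I = \begin{pmatrix} \kappa_1 & \kappa_1' & x_1 & y_1 \end{pmatrix} \begin{pmatrix} 0 & \frac{1}{2} & 0 & 0 \\ \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} \kappa_2 \\ \kappa_2' \\ x_2 \\ y_2 \end{pmatrix},$$

which makes it easy to see that $(C_1, C_2)_I = (C_2, C_1)_I$ and that $(C_1, -C_2)_I =
-(C_1, C_2)_I$. Another way to write the inversive distance is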


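$$(C_1, C_2)_I = -\frac{1}{2}\operatorname{tr}\left( \begin{pmatrix} \kappa_1' & \xi_1 \\ \overline{\xi_1} & \kappa_1 \end{pmatrix} \begin{pmatrix} \kappa_2' & \xi_2 \\ \overline{\xi_2} & \kappa_2 \end{pmatrix}^{-1} \right),$$

where $\operatorname{tr}(M)$ denotes the trace of $M$; that is, if

$$M = \begin{pmatrix} a_{1,1} & a_{1,2} & \ldots & a_{1,n} \\ a_{2,1} & a_{2,2} & \ldots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \ldots & a_{n,n} \end{pmatrix},$$

then $\operatorname{tr}(M) = a_{1,1} + a_{2,2} + \ldots + a_{n,n}$.

¹ This was an observation by statistician Stephen Stigler: no scientific discovery is named after its
discoverer. Stigler attributed this law to sociologist Robert Merton, but it is likely far older.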


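Theorem 3.11 (Invariance of Inversive Distance) Let $C_1, C_2$ be oriented circles
and let $\gamma \in SL(2, \mathbb{C})$. Then $(C_1, C_2)_I = (\gamma.C_1, \gamma.C_2)_I$.


Proof Write

$$M_j = \begin{pmatrix} \kappa_j' & \xi_j \\ \overline{\xi_j} & \kappa_j \end{pmatrix}$$

for the matrix of inversive coordinates of $C_j$. Notice that

$$\begin{aligned} (\gamma.C_1, \gamma.C_2)_I &= -\frac{1}{2}\operatorname{tr}\left( \gamma M_1 \overline{\gamma}^T \left( \gamma M_2 \overline{\gamma}^T \right)^{-1} \right) \\ &= -\frac{1}{2}\operatorname{tr}\left( \gamma M_1 \overline{\gamma}^T \left( \overline{\gamma}^T \right)^{-1} M_2^{-1} \gamma^{-1} \right) \\ &= -\frac{1}{2}\operatorname{tr}\left( \gamma M_1 M_2^{-1} \gamma^{-1} \right). \end{aligned}$$

Here, we use a general property of the trace: for any $n \times n$ matrix $M$ and $G \in
GL(n, \mathbb{C})$, $\operatorname{tr}(M) = \operatorname{tr}(GMG^{-1})$. (See Exercise 3.2.8.) Therefore,

$$(\gamma.C_1, \gamma.C_2)_I = -\frac{1}{2}\operatorname{tr}\left( M_1 M_2^{-1} \right) = (C_1, C_2)_I$$

as desired. □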
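The invariance can be spot-checked numerically. The Python sketch below is only an illustration under stated assumptions: inversive coordinates are stored as tuples (bend, co-bend, bend-center), the action is computed as $\gamma M \overline{\gamma}^T$ with the co-bend in the top-left entry of $M$, and all function names are hypothetical.

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def act(gamma, coords):
    """Apply gamma in SL(2, C) to inversive coordinates (bend, co_bend, xi)."""
    bend, co_bend, xi = coords
    xi = complex(xi)
    M = ((complex(co_bend), xi), (xi.conjugate(), complex(bend)))
    gamma_star = tuple(
        tuple(complex(gamma[j][i]).conjugate() for j in range(2)) for i in range(2)
    )
    N = mat_mul(mat_mul(gamma, M), gamma_star)
    return (N[1][1].real, N[0][0].real, N[0][1])


def inversive_distance(c1, c2):
    """(k1 k2' + k2 k1')/2 - Re(xi1 * conj(xi2)), as in Definition 3.7."""
    k1, kp1, xi1 = c1
    k2, kp2, xi2 = c2
    return (k1 * kp2 + k2 * kp1) / 2 - (complex(xi1) * complex(xi2).conjugate()).real


if __name__ == "__main__":
    C1 = (1.0, -1.0, 0j)       # unit circle centered at 0
    C2 = (1.0, 8.0, 3 + 0j)    # unit-radius circle centered at 3
    gamma = ((2, 1j), (1, (1 + 1j) / 2))   # determinant 2*(1+i)/2 - i = 1
    print(inversive_distance(C1, C2))
    print(inversive_distance(act(gamma, C1), act(gamma, C2)))
```

Both printed values should agree up to rounding, since the matrix chosen here has determinant 1.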


The fact that the inversive distance is invariant under Möbius transformations
certainly signals its importance. Even so, we would like to have a more geometric 
interpretation for what it actually means. As it happens, this is also possible. 


Theorem 3.12 (Geometric Interpretation of Inversive Distance) Let $C_1, C_2$ be
oriented circles that are not lines, with positive orientation and radii $r_1$ and $r_2$. Let
$d$ be the distance between their centers. Then

$$(C_1, C_2)_I = \frac{d^2 - r_1^2 - r_2^2}{2r_1r_2}.$$
Fig. 3.9 Two possible ways that two circles can fail to intersect.


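Proof Let $(\kappa_1, \kappa_1', \xi_1)$, $(\kappa_2, \kappa_2', \xi_2)$ be the inversive coordinates of $C_1$ and $C_2$. Then
$r_1 = 1/\kappa_1$, $r_2 = 1/\kappa_2$, and $d = |\xi_1/\kappa_1 - \xi_2/\kappa_2|$. Therefore,

$$\begin{aligned} \frac{d^2 - r_1^2 - r_2^2}{2r_1r_2} &= \frac{|\xi_1/\kappa_1 - \xi_2/\kappa_2|^2 - 1/\kappa_1^2 - 1/\kappa_2^2}{2/(\kappa_1\kappa_2)} \\ &= \frac{|\kappa_2\xi_1 - \kappa_1\xi_2|^2 - \kappa_2^2 - \kappa_1^2}{2\kappa_1\kappa_2} \\ &= \frac{-\kappa_1^2 - \kappa_2^2 + \kappa_2^2|\xi_1|^2 + \kappa_1^2|\xi_2|^2 - 2\kappa_1\kappa_2\Re(\xi_1\overline{\xi_2})}{2\kappa_1\kappa_2} \\ &= \frac{\kappa_1^2\left(|\xi_2|^2 - 1\right) + \kappa_2^2\left(|\xi_1|^2 - 1\right)}{2\kappa_1\kappa_2} - \Re(\xi_1\overline{\xi_2}) \\ &= \frac{\kappa_1^2(\kappa_2\kappa_2') + \kappa_2^2(\kappa_1\kappa_1')}{2\kappa_1\kappa_2} - \Re(\xi_1\overline{\xi_2}) \\ &= \frac{\kappa_1\kappa_2' + \kappa_2\kappa_1'}{2} - \Re(\xi_1\overline{\xi_2}) = (C_1, C_2)_I, \end{aligned}$$

as was claimed. □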


Corollary 3.3 Two oriented circles $C_1, C_2$ intersect if and only if $|(C_1, C_2)_I| \leq 1$.


Proof The inversive distance is invariant under Möbius transformations, so we can
assume without loss of generality that neither of $C_1, C_2$ is a line. Furthermore, since
$|(C_1, -C_2)_I| = |(C_1, C_2)_I| = |(-C_1, C_2)_I|$, we may assume that both $C_1$ and $C_2$ have
positive orientation. Since $(C_1, C_2)_I = (C_2, C_1)_I$, we can assume that the radius of
$C_1$ is at least as large as the radius of $C_2$. In that case, notice that these two circles
intersect if and only if $r_1 - r_2 \leq d \leq r_1 + r_2$. The two different ways that circles
can fail to intersect are shown in Figure 3.9 to help illustrate this. In any case, this
implies that

$$\frac{d^2 - r_1^2 - r_2^2}{2r_1r_2} \leq \frac{(r_1 + r_2)^2 - r_1^2 - r_2^2}{2r_1r_2} = \frac{2r_1r_2}{2r_1r_2} = 1$$

and

$$\frac{d^2 - r_1^2 - r_2^2}{2r_1r_2} \geq \frac{(r_1 - r_2)^2 - r_1^2 - r_2^2}{2r_1r_2} = \frac{-2r_1r_2}{2r_1r_2} = -1,$$

which can be summarized, by appealing to the geometric interpretation of the inversive
distance, as $|(C_1, C_2)_I| \leq 1$. □

Fig. 3.10 Two pairs of externally tangent circles and two pairs of internally tangent circles.


We can say more. 


Theorem 3.13 (Inversive Distance Angle Formula) Let $C_1, C_2$ be two oriented
circles. They intersect if and only if $|(C_1, C_2)_I| \leq 1$ and if they do, then $|(C_1, C_2)_I| =
|\cos(\phi)|$, where $\phi$ is the angle between them.


Proof Since angles, interiors, and the inversive distance are all preserved by Möbius
transformations, we can reduce to the simple case where $C_1$ is the real line traversed
from left to right and $C_2$ is a line through the origin. In order for the angle between
$C_1$ and $C_2$ to be $\phi$, $C_2$ must be traversed in the direction $e^{\pm i\phi}$. Then

$$|(C_1, C_2)_I| = \left| \frac{\kappa_1\kappa_2' + \kappa_2\kappa_1'}{2} - \Re(\xi_1\overline{\xi_2}) \right| = \left| \Re\left( i \cdot \overline{ie^{\pm i\phi}} \right) \right| = \left| \Re\left( e^{\mp i\phi} \right) \right| = |\cos(\phi)|,$$

finishing the proof. □


Remark 3.5 Henceforth, we shall simply take the convention that the angle $\phi$
between two intersecting oriented circles is the unique real number $-\pi < \phi \leq \pi$
such that $(C_1, C_2)_I = \cos(\phi)$.


Definition 3.8 We say that two oriented circles $C_1, C_2$ are externally tangent if
they intersect at a single point and their interiors do not intersect. We say that
two oriented circles are internally tangent if they intersect at a single point
and their interiors intersect.


Figure 3.10 shows both internally and externally tangent circles.

Corollary 3.4 Let $C_1, C_2$ be two oriented circles.

1. $(C_1, C_2)_I = 1$ if and only if $C_1, C_2$ are externally tangent or $C_1 = -C_2$.
2. $(C_1, C_2)_I = -1$ if and only if $C_1, C_2$ are internally tangent or $C_1 = C_2$.
3. $(C_1, C_2)_I = 0$ if and only if $C_1, C_2$ are orthogonal to each other.

Proof The proof is left as an exercise to the reader. (See Exercise 3.2.9.) □


> Example Determine whether the two circles with centers $1 + i$ and $3 - 4i$ and
radii $2$ and $5$ intersect. If they do, determine the angle of intersection.

The square of the distance between the centers is

$$d^2 = |3 - 4i - (1 + i)|^2 = |2 - 5i|^2 = 4 + 25 = 29.$$

After this, we simply compute the inversive distance.

$$(C_1, C_2)_I = \frac{d^2 - r_1^2 - r_2^2}{2r_1r_2} = \frac{29 - 4 - 25}{2 \cdot 2 \cdot 5} = 0.$$

Thus, the circles don't just intersect: they are orthogonal to one another, which is
to say that the angle of intersection is $\pi/2$.
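The computation in this example is mechanical enough to script. The following Python sketch assumes both circles are positively oriented and not lines, uses the formula of Theorem 3.12, and takes the angle convention $(C_1, C_2)_I = \cos(\phi)$ of Remark 3.5; the function names are illustrative.

```python
import math


def inversive_distance(center1, radius1, center2, radius2):
    """(d^2 - r1^2 - r2^2) / (2 r1 r2) for two positively oriented circles."""
    dz = center1 - center2
    d_squared = dz.real ** 2 + dz.imag ** 2
    return (d_squared - radius1 ** 2 - radius2 ** 2) / (2 * radius1 * radius2)


def intersection_angle(center1, radius1, center2, radius2):
    """Return the angle phi with cos(phi) equal to the inversive distance,
    or None if the circles do not intersect."""
    value = inversive_distance(center1, radius1, center2, radius2)
    if abs(value) > 1:
        return None
    return math.acos(value)


if __name__ == "__main__":
    print(inversive_distance(1 + 1j, 2, 3 - 4j, 5))
    print(intersection_angle(1 + 1j, 2, 3 - 4j, 5))
```

For the two circles of the example, the first value is the inversive distance and the second is the intersection angle in radians; a return value of `None` would signal that the circles are disjoint.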


3.6 Steiner’s Porism Revisited 


We are finally ready to look at the two examples of applications of inversive geometry 
a little more closely. We shall start with Steiner’s porism. Previously, we managed 
to prove that whether or not a Steiner chain is open or closed does not depend on 
the choice of starting point used in the construction. However, we did not give any 
simple criterion for determining whether a Steiner chain is open or closed. We shall 
now rectify this. 


Theorem 3.14 Let $C_1, C_2$ be two non-intersecting generalized circles. The Steiner
chain that they define is closed and contains $n$ circles other than the initial two if
and only if

$$|(C_1, C_2)_I| = 2\sec(\pi/n)^2 - 1.$$


Proof We already know that any Steiner chain can be reduced via Möbius transformations
to the case where both circles are concentric at the origin, as in Figure 3.11.
The inversive distance is invariant under Möbius transformations, so if the theorem
is true in that case, then it must be true in the general case. But

$$|(C_1, C_2)_I| = \frac{|d^2 - r_1^2 - r_2^2|}{2r_1r_2} = \frac{r_1^2 + r_2^2}{2r_1r_2},$$

and if we rescale so that the inner circle has radius 1, then this further simplifies to

$$\frac{1 + r_2^2}{2r_2} = \frac{1}{2}\left( r_2 + \frac{1}{r_2} \right).$$

There can be at most one radius $r_2$ such that the resulting Steiner chain is closed with
$n$ circles. On the other hand, the function

$$(1, \infty) \to (1, \infty), \qquad x \mapsto \frac{1}{2}\left( x + \frac{1}{x} \right)$$

is bijective; its inverse is $x \mapsto x + \sqrt{x^2 - 1}$. Therefore, there is at most one value
for the inversive distance such that the resulting Steiner chain is closed with $n$ circles.
It remains to compute the inversive distance for the circles in such a chain. Such a
chain is easy enough to construct. We take the circles in the chain to have centers
$e^{2\pi ik/n}$ for $k = 0, 1, 2, \ldots, n - 1$ so that they are equidistant around the unit circle; for
them to be tangent, their radii have to be half the distance between adjacent centers, or

$$\frac{1}{2}\sqrt{4\sin\left( \frac{\pi}{n} \right)^2} = \sin\left( \frac{\pi}{n} \right).$$

There are two circles with centers at the origin that are tangent to all of these circles:
one has radius $1 - \sin(\pi/n)$ and the other has radius $1 + \sin(\pi/n)$. The desired
invariant is thus

$$\begin{aligned} |(C_1, C_2)_I| = \frac{r_1^2 + r_2^2}{2r_1r_2} &= \frac{(1 - \sin(\pi/n))^2 + (1 + \sin(\pi/n))^2}{2(1 - \sin(\pi/n))(1 + \sin(\pi/n))} \\ &= \frac{2 + 2\sin(\pi/n)^2}{2(1 - \sin(\pi/n)^2)} \\ &= \frac{2 - \cos(\pi/n)^2}{\cos(\pi/n)^2} \\ &= 2\sec(\pi/n)^2 - 1, \end{aligned}$$

precisely as claimed. □

Fig. 3.11 Steiner chains of various lengths, in the standard configuration where the constructed
circles all have centers on the unit circle.


Corollary 3.5 Let $C_1, C_2$ be two generalized circles. They define a closed Steiner
chain if and only if

$$\pi \arccos\left( \sqrt{\frac{2}{1 + |(C_1, C_2)_I|}} \right)^{-1}$$

is an integer.


Proof We know that $C_1$ and $C_2$ intersect if and only if $|(C_1, C_2)_I| \leq 1$. However, for
non-negative $x$,

$$\arccos\left( \sqrt{\frac{2}{1 + x}} \right)^{-1}$$

is defined (or, at least, a real number) if and only if $x > 1$. In this same range, it is
easy to check that

$$\pi \arccos\left( \sqrt{\frac{2}{1 + x}} \right)^{-1}$$

is the inverse function to $x \mapsto 2\sec(\pi/x)^2 - 1$, and so the claim is an immediate
consequence of Theorem 3.14. □
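The criterion of Corollary 3.5 can be evaluated directly. The Python sketch below is a minimal illustration: `steiner_chain_length` inverts $x \mapsto 2\sec(\pi/x)^2 - 1$, and `concentric_inversive_distance` is a hypothetical helper giving the absolute inversive distance of two concentric circles, as computed in the proof of Theorem 3.14.

```python
import math


def steiner_chain_length(inversive_distance):
    """Return pi / arccos(sqrt(2 / (1 + |I|))), or None if the circles intersect.

    The chain is closed exactly when the returned value is an integer.
    """
    x = abs(inversive_distance)
    if x <= 1:
        return None
    return math.pi / math.acos(math.sqrt(2 / (1 + x)))


def concentric_inversive_distance(r1, r2):
    """Absolute inversive distance (r1^2 + r2^2) / (2 r1 r2) of concentric circles."""
    return (r1 * r1 + r2 * r2) / (2 * r1 * r2)


if __name__ == "__main__":
    n = 6
    inner = 1 - math.sin(math.pi / n)
    outer = 1 + math.sin(math.pi / n)
    print(steiner_chain_length(concentric_inversive_distance(inner, outer)))
```

In floating-point arithmetic the result will only be close to an integer, so a practical test for closure would compare against the nearest integer with a small tolerance.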


One possible way to generalize closed Steiner chains is to allow the constructed 
circles to intersect, rather than halting once it becomes impossible to add another non- 
intersecting circle. This gives analogs of Steiner chains that “wrap around” in some 
sense, as in Figure 3.12. Sometimes, these chains keep wrapping forever. Sometimes, 
they eventually overlap on top of themselves. I leave determining when both cases 
occur as an interesting problem for the reader. 




Fig.3.12 Some examples of generalized Steiner chains. 


3.7 Apollonian Gaskets Revisited


When we originally defined the Apollonian gasket, we did not worry about the 
orientation of the starting four circles. This is inconvenient, but the good news is 
that it is easy to see that, given any Descartes configuration, there is only one way to 
choose an orientation on the four generalized circles such that their interiors do not 
intersect. Furthermore, when we do this, the four oriented circles are then externally 
tangent and so we get that if $C_1, C_2, C_3, C_4$ are the four oriented circles, it must be
that

$$(C_i, C_j)_I = \begin{cases} 1 & \text{if } i \neq j \\ -1 & \text{if } i = j. \end{cases}$$

With this in mind, we make the following definition.


Definition 3.9 An oriented Descartes configuration is a quadruple of oriented circles 
that are all externally tangent to one another. 


It is easy to see that we have shown the following. 


Lemma 3.4 Four oriented circles $C_1, C_2, C_3, C_4$ form an oriented Descartes
configuration if and only if

$$(C_i, C_j)_I = \begin{cases} 1 & \text{if } i \neq j \\ -1 & \text{if } i = j. \end{cases}$$

We already know that we can use Descartes swaps to get new Descartes configurations
from old ones. In fact, it is easy to see that they generate oriented Descartes
configurations from existing ones. As a consequence, it is natural to make the
following definition.


Definition 3.10 Let $C_1, C_2, C_3, C_4$ be an oriented Descartes configuration. The
oriented Apollonian gasket with starting configuration $C_1, C_2, C_3, C_4$ is the smallest
set $S$ of oriented circles in $\mathbb{CP}^1$ such that

1. $S$ contains $C_1, C_2, C_3, C_4$ and
2. if $C_1', C_2', C_3', C_4'$ are all oriented circles in $S$ that form an oriented Descartes
configuration, then all of the Descartes swaps of these circles are also in $S$.

Fig. 3.13 A selection of oriented Apollonian gaskets.


Some examples are drawn in Figure 3.13. The oriented Apollonian gasket differs
from our previous definition only in that we have added an orientation
to all of the circles in the collection. However, this will be easier to show once
we have found a more convenient way to describe the oriented Apollonian gasket.
We begin by defining a nice subset of the gasket.


Lemma 3.5 Let $A$ be the oriented Apollonian gasket with initial configuration $C_1$,
$C_2, C_3, C_4$. Let $G_A$ be the Apollonian group. Define

$$P_A = \{ g(C) \mid g \in G_A, \ C \in \{C_1, C_2, C_3, C_4\} \}.$$

Then $P_A \subseteq A$.


Proof Choose any $g \in G_A$. Let $\gamma_i$ be the reflection through the dual circle that swaps
out $C_i$. We can write $g = \gamma_{i_n} \circ \gamma_{i_{n-1}} \circ \ldots \circ \gamma_{i_1}$ for some $1 \leq i_1, i_2, \ldots, i_n \leq 4$. Define
$g_{k,l} = \gamma_{i_k} \circ \gamma_{i_{k-1}} \circ \ldots \circ \gamma_{i_l}$. Notice that $(g_{k,1}(C_1), g_{k,1}(C_2), g_{k,1}(C_3), g_{k,1}(C_4))$ is
the Descartes swap of $(g_{k,2}(C_1), g_{k,2}(C_2), g_{k,2}(C_3), g_{k,2}(C_4))$; this is because the
image under a linear fractional transformation of a Descartes swap is a Descartes
swap. Now, we shall prove by induction that for any element $g \in G_A$ that can be written
as a composition of no more than $k$ of the $\gamma_i$, $g(\{C_1, C_2, C_3, C_4\}) \subseteq A$. Indeed,
if $k = 0$, this is obvious, since $g = \mathrm{id}$. Otherwise, assume it is true for $k - 1$;
we already saw that $(g_{k,1}(C_1), g_{k,1}(C_2), g_{k,1}(C_3), g_{k,1}(C_4))$ is the Descartes swap
of $(g_{k,2}(C_1), g_{k,2}(C_2), g_{k,2}(C_3), g_{k,2}(C_4))$. However, $g_{k,2}$ can be written as a
composition of $k - 1$ $\gamma_i$'s, hence $\{g_{k,2}(C_1), g_{k,2}(C_2), g_{k,2}(C_3), g_{k,2}(C_4)\} \subseteq A$. However,
since $A$ is closed under Descartes swaps, it follows that $g(C_1), g(C_2), g(C_3), g(C_4) \in
A$ as well. We conclude that $P_A \subseteq A$. □


In actuality, $A = P_A$. However, to prove this, we shall need a few lemmas.


Lemma 3.6 Let $C_1$ be the line $y = 0$ traversed from right to left, $C_2$ be the line
$y = 1$ traversed from left to right, $C_3$ be the circle $x^2 + (y - 1/2)^2 = 1/4$ oriented
counter-clockwise, and $C_4$ be the circle $(x - 1)^2 + (y - 1/2)^2 = 1/4$ oriented
counter-clockwise. Let $A$ be the oriented Apollonian gasket with initial configuration $C_1, C_2$,
$C_3, C_4$. Let $P_A$ be defined as in Lemma 3.5. Then

$$P_A \subseteq \left\{ \pm\gamma.C_1 \;\middle|\; \gamma \in SL(2, \mathbb{Z}[i]) \right\}$$

where

$$SL(2, \mathbb{Z}[i]) = \left\{ \begin{pmatrix} a_0 + a_1i & b_0 + b_1i \\ c_0 + c_1i & d_0 + d_1i \end{pmatrix} \in SL(2, \mathbb{C}) \;\middle|\; a_0, a_1, b_0, b_1, c_0, c_1, d_0, d_1 \in \mathbb{Z} \right\}.$$

Fig. 3.14 The standard oriented Descartes configuration is depicted in yellow. The Descartes swaps
are drawn in purple. The dual circles are drawn in red.


Remark 3.6 The set Z[i] is known as the Gaussian integers—defined as the set of 
complex numbers a + bi where a,b € Z—and is of fundamental importance in 
elementary number theory. We will not pursue this further in this text, but it is a 
common feature of number theory books such as Rosen’s [13]. 


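Proof This configuration is illustrated in Figure 3.14. The reflections through the
dual circles are

$$g_1(z) = \frac{(2 - i)\overline{z} + 2i}{-2i\overline{z} + (2 + i)}, \qquad g_2(z) = \frac{i\overline{z}}{2i\overline{z} - i}, \qquad g_3(z) = \frac{i\overline{z} - 2i}{-i}, \qquad g_4(z) = \frac{i\overline{z}}{-i}.$$

This way of writing them might seem odd; it is chosen to demonstrate that $g_i =
\Psi(\gamma_i) \circ \mathrm{conj}$ for

$$\gamma_1 = \begin{pmatrix} 2 - i & 2i \\ -2i & 2 + i \end{pmatrix}, \quad \gamma_2 = \begin{pmatrix} i & 0 \\ 2i & -i \end{pmatrix}, \quad \gamma_3 = \begin{pmatrix} i & -2i \\ 0 & -i \end{pmatrix}, \quad \gamma_4 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix},$$

where each $\gamma_i$ is in $SL(2, \mathbb{C})$. In fact, something stronger is true: each $\gamma_i$ is an
element of the set

$$SL(2, \mathbb{Z}[i]) = \left\{ \begin{pmatrix} a_0 + a_1i & b_0 + b_1i \\ c_0 + c_1i & d_0 + d_1i \end{pmatrix} \in SL(2, \mathbb{C}) \;\middle|\; a_0, a_1, b_0, b_1, c_0, c_1, d_0, d_1 \in \mathbb{Z} \right\}.$$

This set is a group (see Exercise 3.3.2), but really we only need that it is closed under
matrix multiplication, which is very easy to check, and that if $\gamma \in SL(2, \mathbb{Z}[i])$ then
there exists some $\overline{\gamma} \in SL(2, \mathbb{Z}[i])$ such that

$$\mathrm{conj} \circ \Psi(\gamma) = \Psi(\overline{\gamma}) \circ \mathrm{conj},$$

which is also very easy to check. One final thing to note is that for $j = 2, 3, 4$,
$\tilde{\gamma}_j.C_1 = \pm C_j$ if we take

$$\tilde{\gamma}_2 = \begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix}, \qquad \tilde{\gamma}_3 = \begin{pmatrix} 0 & i \\ i & 1 \end{pmatrix}, \qquad \tilde{\gamma}_4 = \begin{pmatrix} i & 1 + i \\ i & 1 \end{pmatrix}.$$

Note that $\tilde{\gamma}_2, \tilde{\gamma}_3, \tilde{\gamma}_4 \in SL(2, \mathbb{Z}[i])$ as well. By these observations and the definition
of $P_A$, we conclude that every element in it can be written in the form $\pm\gamma.C_1$ for
some $\gamma \in SL(2, \mathbb{Z}[i])$. □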


Lemma 3.7 Let $A$ be an oriented Apollonian gasket with initial configuration $C_1$,
$C_2, C_3, C_4$, let $G_A$ be the Apollonian group, and let $P_A$ be defined as in Lemma 3.5.
If $D_1, D_2 \in P_A$ intersect, then they are tangent.


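Proof Since linear fractional transformations preserve oriented circles and tangencies,
we can assume that the initial configuration is the standard one defined in
Lemma 3.6. Thus, we know that $D_1 = \pm\gamma_1.C_1$ and $D_2 = \pm\gamma_2.C_1$ for some
$\gamma_1, \gamma_2 \in SL(2, \mathbb{Z}[i])$. By Corollary 3.3, we know that $D_1$ and $D_2$ intersect if and
only if $|(D_1, D_2)| \leq 1$. But by the invariance of the inversive distance, we see that

\[
|(D_1, D_2)| = |(\gamma_1.C_1, \gamma_2.C_1)| = |(C_1, \gamma_1^{-1}\gamma_2.C_1)|.
\]

Let

\[
\gamma_1^{-1}\gamma_2 = \begin{pmatrix} a_0 + a_1 i & b_0 + b_1 i \\ c_0 + c_1 i & d_0 + d_1 i \end{pmatrix}.
\]

Noting that $\mathrm{inv}(C_1) = (0, 0, -i)$, we see that

\[
\begin{aligned}
|(C_1, \gamma_1^{-1}\gamma_2.C_1)|
&= \frac{1}{2}\left| \mathrm{tr}\left( \gamma_1^{-1}\gamma_2 \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} (\gamma_1^{-1}\gamma_2)^{\dagger} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \right) \right| \\
&= \frac{1}{2}\left| \mathrm{tr}\left( \begin{pmatrix} a_0 + a_1 i & b_0 + b_1 i \\ c_0 + c_1 i & d_0 + d_1 i \end{pmatrix} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \begin{pmatrix} a_0 - a_1 i & c_0 - c_1 i \\ b_0 - b_1 i & d_0 - d_1 i \end{pmatrix} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \right) \right| \\
&= |a_0 d_0 + a_1 d_1 - b_0 c_0 - b_1 c_1|.
\end{aligned}
\]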


We can actually say a little bit more, because we know that by definition,

\[
\begin{aligned}
1 &= \det \begin{pmatrix} a_0 + a_1 i & b_0 + b_1 i \\ c_0 + c_1 i & d_0 + d_1 i \end{pmatrix} \\
&= a_0 d_0 - a_1 d_1 - b_0 c_0 + b_1 c_1 + i\,(a_1 d_0 + a_0 d_1 - b_1 c_0 - b_0 c_1),
\end{aligned}
\]

whence

\[
\begin{aligned}
|(D_1, D_2)| &= |a_0 d_0 + a_1 d_1 - b_0 c_0 - b_1 c_1| \\
&= |a_0 d_0 + a_1 d_1 - b_0 c_0 - b_1 c_1 + 1 - (a_0 d_0 - a_1 d_1 - b_0 c_0 + b_1 c_1)| \\
&= |1 + 2(a_1 d_1 - b_1 c_1)|.
\end{aligned}
\]

However, $a_1 d_1 - b_1 c_1$ is some integer, so this implies that $|(D_1, D_2)|$ is a positive
odd integer, which means that if $|(D_1, D_2)| \leq 1$ then $|(D_1, D_2)| = 1$. □
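The parity argument can be tested on sample matrices. The following is a minimal sketch, assuming Gaussian integers are stored as integer pairs `(real, imag)` and that random elements of $SL(2, \mathbb{Z}[i])$ are built as words in three determinant-one matrices chosen here purely for illustration; all function names are hypothetical. It compares $a_0 d_0 + a_1 d_1 - b_0 c_0 - b_1 c_1$ with $1 + 2(a_1 d_1 - b_1 c_1)$.

```python
import random


def gmul(x, y):
    """Multiply Gaussian integers stored as (real, imag) integer pairs."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def gadd(x, y):
    """Add Gaussian integers stored as (real, imag) integer pairs."""
    return (x[0] + y[0], x[1] + y[1])


def matmul(m, n):
    """Product of two 2 x 2 matrices with Gaussian integer entries."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return (
        (gadd(gmul(a, e), gmul(b, g)), gadd(gmul(a, f), gmul(b, h))),
        (gadd(gmul(c, e), gmul(d, g)), gadd(gmul(c, f), gmul(d, h))),
    )


ZERO, ONE, NEG_ONE, I = (0, 0), (1, 0), (-1, 0), (0, 1)
IDENTITY = ((ONE, ZERO), (ZERO, ONE))

# Three determinant-one matrices used as building blocks (an illustrative choice).
BLOCKS = [
    ((ONE, ONE), (ZERO, ONE)),       # z -> z + 1
    ((ONE, I), (ZERO, ONE)),         # z -> z + i
    ((ZERO, NEG_ONE), (ONE, ZERO)),  # z -> -1/z
]


def random_element(length, rng):
    """A random word of the given length in the building blocks."""
    m = IDENTITY
    for _ in range(length):
        m = matmul(m, rng.choice(BLOCKS))
    return m


def inversive_value(m):
    """a0*d0 + a1*d1 - b0*c0 - b1*c1, whose absolute value appears in the proof."""
    (a, b), (c, d) = m
    return a[0] * d[0] + a[1] * d[1] - b[0] * c[0] - b[1] * c[1]


def odd_form(m):
    """1 + 2*(a1*d1 - b1*c1), the rewritten expression from the proof."""
    (a, b), (c, d) = m
    return 1 + 2 * (a[1] * d[1] - b[1] * c[1])


rng = random.Random(2024)
for _ in range(5):
    m = random_element(8, rng)
    print(inversive_value(m), odd_form(m))
```

Exact integer arithmetic is used on purpose: the conclusion is about parity, which floating-point rounding could obscure.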


Lemma 3.8 Let $\mathcal{A}$ be an oriented Apollonian gasket with initial configuration $C_1$,
$C_2$, $C_3$, $C_4$, let $G_\mathcal{A}$ be the Apollonian group, and let $\mathcal{P}_\mathcal{A}$ be defined as in Lemma 3.5.
Let $D_1, D_2, D_3, D_4$ be a Descartes configuration of circles in $\mathcal{P}_\mathcal{A}$. Then there exists
an element $\gamma \in G_\mathcal{A}$ such that $\{D_1, D_2, D_3, D_4\} = \{\gamma(C_1), \gamma(C_2), \gamma(C_3), \gamma(C_4)\}$.


Remark 3.7 Note that the statement of the lemma refers to Descartes configurations,
rather than oriented Descartes configurations. This is not a mistake. One consequence
of this lemma is that all Descartes configurations inside $\mathcal{P}_\mathcal{A}$ are oriented Descartes
configurations automatically. An illustration of how to reduce an arbitrary Descartes
quadruple to the base one is shown in Figure 3.15.


Proof Let $g_i$ be the reflection through the dual circle defined by the fact that $g_i(C_j) =
C_j$ if $i \neq j$; by definition, we know that every element in $G_\mathcal{A}$ can be written as a
product of these $g_i$'s. Furthermore, by the definition of $\mathcal{P}_\mathcal{A}$, $D_1 = \gamma_1 C_i$ for some $\gamma_1 \in
G_\mathcal{A}$ and $i \in \{1, 2, 3, 4\}$; without loss of generality, we shall assume that $D_1 = \gamma_1 C_1$.
There will be many different choices for $\gamma \in G_\mathcal{A}$ such that $D_1 = \gamma C_1$; choose $\gamma_1$ so
that if we write $\gamma_1^{-1} D_2 = \gamma_2 C_j$ for some $\gamma_2 \in G_\mathcal{A}$ and some $j \in \{1, 2, 3, 4\}$, then
$\gamma_2$ can be written as a product of the $g_i$'s in the shortest possible way. We claim that,
in fact, $\gamma_2$ is the identity.

Suppose not, and write $\gamma_2 = g_{i_1} g_{i_2} \cdots g_{i_k}$ for some integers $i_1, i_2, \ldots, i_k$ in the
shortest possible way. Notice that if $i_1 \neq 1$, then if we took $\gamma_1' = \gamma_1 g_{i_1}$, we would have
$\gamma_1'^{-1}(D_1) = g_{i_1}^{-1} \gamma_1^{-1}(D_1) = g_{i_1}(C_1) = C_1$ and $\gamma_1'^{-1}(D_2) = g_{i_1}^{-1} \gamma_1^{-1}(D_2) =
g_{i_2} \cdots g_{i_k}(C_j)$. Due to the way that $\gamma_1$ was chosen, this is impossible; hence, $i_1 =
1$. What is more, we can see that $i_1 \neq i_2$, $i_2 \neq i_3$, and so on. This is because
otherwise we could cancel out the corresponding swaps and get a shorter way of
expressing $\gamma_2$. Finally, it must be that $i_k = j$; otherwise, we could write $\gamma_1^{-1}(D_2) =
g_{i_1} g_{i_2} \cdots g_{i_{k-1}}(C_j)$ since $g_l(C_j) = C_j$ if $l \neq j$. Now, $g_{i_k}$ will move the interior of
$C_j$ into the interior of the $j$-th dual circle; $g_{i_{k-1}}$ will move it into the interior of the
$i_{k-1}$-th dual circle, and so on. The last inversion will be $g_{i_1}$, moving this set into the
interior of the first dual circle. Thus, $\gamma_1^{-1} D_2$ must be contained in the interior of the
first dual circle. We know that $D_1$ is tangent to $D_2$, so it must be that $\gamma_1^{-1} D_1 = C_1$
is tangent to $\gamma_1^{-1} D_2$. But $C_1$ does not intersect the first dual circle so it cannot possibly
Fig. 3.15 (a) shows a Descartes configuration inside an Apollonian gasket in close up; (b) shows
the same, but zoomed out. In (c), the smallest circle in the configuration is moved to the largest 
circle in the gasket. In (d), this move is chosen such that one of the other circles in the configuration 
is moved to one in the base quadruple. (e) and (f) show how the remaining two circles are moved 
into their correct positions. 


be tangent to a circle contained in its interior. This is a contradiction, and so we
conclude that indeed $\gamma_2$ is the identity.

Since $\gamma_2$ is the identity, without loss of generality, we can assume that $\gamma_1^{-1}(D_2) =
C_2$. Furthermore, for simplicity, we can take $C_1$ to be the real line traversed from
right to left and $C_2$ to be the line $y = 1$ traversed from left to right. Then $\gamma_1^{-1}(D_3)$
and $\gamma_1^{-1}(D_4)$ must be circles tangent to both of these lines. This implies that they
must be two circles with radii $1/2$ and centers $x_0 + i/2$ and $x_0 + 1 + i/2$. However, it
is easy to see that $\mathcal{P}_\mathcal{A}$ contains all such circles with $x_0$ an integer, and so $\gamma_1^{-1}(D_3)$ and
$\gamma_1^{-1}(D_4)$ will have to intersect them. By Lemma 3.7, they must actually be tangent
to them, which can only happen if $x_0$ is an integer. Thus, by applying $g_3$ and $g_4$,
we can move $\gamma_1^{-1}(D_3)$ and $\gamma_1^{-1}(D_4)$ onto $C_3$ and $C_4$. Without loss of generality,
we can assume that $\gamma_1^{-1}(D_3) = \gamma_2(C_3)$ and $\gamma_1^{-1}(D_4) = \gamma_2(C_4)$ for some $\gamma_2$ which
is a composition of $g_3$'s and $g_4$'s. Then if we take $\gamma = \gamma_1\gamma_2$, $D_i = \gamma(C_i)$ for
$i = 1, 2, 3, 4$, as desired. □


We now fully understand the circumstances under which a Descartes configuration 
can appear inside the Apollonian gasket. As an immediate consequence, we get the 
following theorem which describes in detail what the gasket looks like and what 
properties it has. 




Theorem 3.15 (The Structure Theorem for Apollonian Gaskets) Let $\mathcal{A}$ be an
oriented Apollonian gasket with initial configuration $C_1$, $C_2$, $C_3$, $C_4$. Let $G_\mathcal{A}$ be the
Apollonian group. Then

\[
\mathcal{A} = \mathcal{P}_\mathcal{A} = \left\{ \gamma.C \;\middle|\; \gamma \in G_\mathcal{A},\ C \in \{C_1, C_2, C_3, C_4\} \right\}.
\]

Furthermore, it satisfies the following properties:


1. Every Descartes configuration in $\mathcal{A}$ is the image of $\{C_1, C_2, C_3, C_4\}$ under some
element $\gamma \in G_\mathcal{A}$.

2. If we forget about orientation, then the circles in $\mathcal{A}$ are the Apollonian gasket
with initial configuration $C_1$, $C_2$, $C_3$, $C_4$.

3. For every pair of circles $D_1, D_2 \in \mathcal{A}$, if $D_1$ and $D_2$ intersect then either $D_1 = D_2$
or they are externally tangent.

4. If $D_1, D_2, D_3, D_4$ is an oriented Descartes configuration in $\mathcal{A}$, then $\mathcal{A}$ is also the
oriented Apollonian gasket with initial configuration $D_1, D_2, D_3, D_4$.


Proof By Lemma 3.5, $\mathcal{A}$ is contained in the given set $\mathcal{P}_\mathcal{A}$. By Lemma 3.8, every
Descartes quadruple in $\mathcal{P}_\mathcal{A}$ can be obtained as $\{\gamma(C_1), \gamma(C_2), \gamma(C_3), \gamma(C_4)\}$ for
some $\gamma \in G_\mathcal{A}$. But this means that $\mathcal{P}_\mathcal{A}$ also contains $\{\gamma(C_1), \gamma(C_2), \gamma(C_3), \gamma(C_4')\}$,
where $C_4'$ is the Descartes swap of $C_4$; from this, we see that $\mathcal{P}_\mathcal{A}$ contains all
Descartes swaps of circles in $\mathcal{P}_\mathcal{A}$. By the definition of $\mathcal{A}$ as the smallest set
containing the Descartes swaps, $\mathcal{A} = \mathcal{P}_\mathcal{A}$. It immediately follows that every Descartes
configuration in $\mathcal{A}$ is the image of $\{C_1, C_2, C_3, C_4\}$. It is clear that $\mathcal{A}$ is contained
inside the Apollonian gasket with initial configuration $C_1, C_2, C_3, C_4$; however,
since it is closed under Descartes swaps of all Descartes configurations, these two
sets must actually be equal since the Apollonian gasket is defined to be the smallest
such set. Finally, choose two circles $D_1, D_2 \in \mathcal{A}$ which intersect. By Lemma 3.7, $D_1$
is tangent to $D_2$. As in the proof of Lemma 3.8, we can find an element $\gamma \in G_\mathcal{A}$ such
that $\gamma(D_1), \gamma(D_2)$ are in $\{C_1, C_2, C_3, C_4\}$. For the circles in the initial configuration,
they are either equal or externally tangent, so this must be true of $D_1$ and $D_2$ as well.
For the last part, by Lemma 3.8, there exists some $g \in G_\mathcal{A}$ such that

\[
\{C_1, C_2, C_3, C_4\} = \{g^{-1}(D_1), g^{-1}(D_2), g^{-1}(D_3), g^{-1}(D_4)\}.
\]

Let $\gamma_1, \gamma_2, \gamma_3, \gamma_4$ be the Descartes swaps of $C_1, C_2, C_3, C_4$. Then it isn’t hard to see
that $g \circ \gamma_i \circ g^{-1}$ will be the Descartes swaps of $D_1, D_2, D_3, D_4$. As the map

\[
G_\mathcal{A} \to G_\mathcal{A}, \qquad \gamma \mapsto g \circ \gamma \circ g^{-1}
\]

is a bijection, we see that actually $G_\mathcal{A}$ can also be described as the smallest subgroup
of Mob(2) containing the Descartes swaps of $D_1, D_2, D_3, D_4$. This implies that
$C_1, C_2, C_3, C_4$ are contained in the oriented Apollonian gasket with starting
configuration $D_1, D_2, D_3, D_4$ since, as stated above,

\[
\{C_1, C_2, C_3, C_4\} = \{g^{-1}(D_1), g^{-1}(D_2), g^{-1}(D_3), g^{-1}(D_4)\}.
\]
Fig. 3.16 A sculpture that hung in the math department at CUNY’s Graduate Center when I was a
post-doc there, along with a sculpture that could have hung there, but didn’t. 


But that means that it is the smallest collection of oriented circles which contains
$C_1, C_2, C_3, C_4$ and which is closed under Descartes swaps, which means that it is
just $\mathcal{A}$. □


Thus, at long last, we know that illustrations like those in Figure 3.16 really are
representative of Apollonian gaskets.


3.8 Descartes’ Theorem 


In 1643, René Descartes wrote a letter to Princess Elisabeth of the Palatinate, with
whom he held regular correspondence primarily on questions of philosophy. Included 
in that letter was the following result, which we state in more modern language. 


Theorem 3.16 (Descartes’ Theorem) Let $C_1, C_2, C_3, C_4$ be an oriented Descartes
configuration. Let $b_1, b_2, b_3, b_4$ be their bends. Then $(b_1 + b_2 + b_3 + b_4)^2 = 2(b_1^2 +
b_2^2 + b_3^2 + b_4^2)$.


Remark 3.8 We don’t know exactly what Descartes’ proof was, but I can say with 
confidence that it was not the proof that shall be given here. This is because our proof 
makes heavy use of matrix multiplication, which was only first described in 1812 by 
Jacques Philippe Marie Binet—this proof is most likely newer still. There are many, 
many other known proofs [9]. 


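Proof We know that $C_1, C_2, C_3, C_4$ are a Descartes configuration if and only if
$(C_i, C_j) = 1$ if $i \neq j$, and $-1$ otherwise. If we take the inversive coordinates of $C_i$
to be $(b_i, c_i, x_i, y_i)$, then this can be stated as

\[
\begin{pmatrix} b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \end{pmatrix}^{T}
\begin{pmatrix} 0 & \tfrac{1}{2} & 0 & 0 \\ \tfrac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
\begin{pmatrix} b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \end{pmatrix}
=
\begin{pmatrix} -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \end{pmatrix}.
\]

This can be expressed a little more compactly as $D^T M D = R$ if we call

\[
D = \begin{pmatrix} b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \end{pmatrix}, \quad
M = \begin{pmatrix} 0 & \tfrac{1}{2} & 0 & 0 \\ \tfrac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad
R = \begin{pmatrix} -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \end{pmatrix}.
\]

It follows that $M = (D^T)^{-1} R D^{-1}$. But if we take the inverse of both sides, we will
get the relation $D R^{-1} D^T = M^{-1}$. One checks that $R^{-1} = R/4$ and

\[
M^{-1} = \begin{pmatrix} 0 & 2 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix},
\]

so

\[
\begin{pmatrix} b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \end{pmatrix}
\begin{pmatrix} -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \end{pmatrix}
\begin{pmatrix} b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \end{pmatrix}^{T}
=
\begin{pmatrix} 0 & 8 & 0 & 0 \\ 8 & 0 & 0 & 0 \\ 0 & 0 & -4 & 0 \\ 0 & 0 & 0 & -4 \end{pmatrix},
\]

and in particular

\[
\begin{pmatrix} b_1 & b_2 & b_3 & b_4 \end{pmatrix}
\begin{pmatrix} -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix} = 0.
\]

Multiplying out the expression above, one gets

\[
-b_1^2 - b_2^2 - b_3^2 - b_4^2 + 2b_1b_2 + 2b_1b_3 + 2b_1b_4 + 2b_2b_3 + 2b_2b_4 + 2b_3b_4 = 0,
\]

or

\[
\begin{aligned}
2(b_1^2 + b_2^2 + b_3^2 + b_4^2)
&= b_1^2 + b_2^2 + b_3^2 + b_4^2 + 2b_1b_2 + 2b_1b_3 + 2b_1b_4 + 2b_2b_3 + 2b_2b_4 + 2b_3b_4 \\
&= (b_1 + b_2 + b_3 + b_4)^2,
\end{aligned}
\]

as was claimed. □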
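Two steps of the proof are easy to confirm by direct computation: that $R^2 = 4I$ (so $R^{-1} = R/4$), and that the quadratic form $b^T R\, b$ is exactly $(b_1 + b_2 + b_3 + b_4)^2 - 2(b_1^2 + b_2^2 + b_3^2 + b_4^2)$. The sketch below does this with plain lists; the function names are illustrative, and the sample quadruples are the bends named in Exercises 3.1.4 and 3.1.5 together with $(0, 0, 2, 2)$, which assumes a configuration of two parallel lines and two circles of radius 1/2.

```python
def descartes_defect(bends):
    """(b1+b2+b3+b4)^2 - 2(b1^2+b2^2+b3^2+b4^2); zero when Descartes' relation holds."""
    total = sum(bends)
    return total * total - 2 * sum(b * b for b in bends)


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# The matrix R from the proof: -1 on the diagonal, 1 elsewhere.
R = [[-1 if i == j else 1 for j in range(4)] for i in range(4)]


def quadratic_form(bends):
    """b^T R b, the top-left entry of D R D^T in the proof."""
    return sum(bends[i] * R[i][j] * bends[j] for i in range(4) for j in range(4))


print(matmul(R, R))
for bends in [(0, 0, 2, 2), (0, 1, 1, 4), (-1, 2, 2, 3)]:
    print(bends, descartes_defect(bends), quadratic_form(bends))
```

A quadruple that is not a Descartes configuration, such as $(1, 2, 3, 4)$, gives a nonzero value for both functions, and the two values still agree.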
Corollary 3.6 Let $C_1, C_2, C_3, C_4$ be an oriented Descartes configuration. Let $b_1$,
$b_2$, $b_3$, $b_4$ be their bends. If $C_4'$ is the Descartes swap of $C_4$, then its bend is $b_4' =
2b_1 + 2b_2 + 2b_3 - b_4$.


Proof Since $C_1, C_2, C_3, C_4'$ is also an oriented Descartes configuration, we must
have that $b_4, b_4'$ are both solutions to the quadratic polynomial equation

\[
(b_1 + b_2 + b_3 + X)^2 = 2(b_1^2 + b_2^2 + b_3^2 + X^2),
\]

which is more conveniently rearranged as

\[
X^2 - 2(b_1 + b_2 + b_3)X + (b_1^2 + b_2^2 + b_3^2 - 2b_1b_2 - 2b_1b_3 - 2b_2b_3) = 0.
\]

The sum of the roots must be $b_4 + b_4' = 2(b_1 + b_2 + b_3)$, whence the result. □


Descartes’ theorem and its corollary were rediscovered many times, including by 
English radiochemist Frederick Soddy in 1936, who then wrote the poem “The Kiss 
Precise” about it which was published in Nature [15]. It would be remiss for me not 
to include at least an excerpt from it. 


Four circles to the kissing come. 

The smaller are the benter. 

The bend is just the inverse of 

The distance from the center. 

Though their intrigue left Euclid dumb 
There’s now no need for rule of thumb. 
Since zero bend’s a dead straight line 
And concave bends have minus sign, 
The sum of the squares of all four bends 


Is half the square of their sum. 


One of Soddy’s other contributions to the study of Descartes configurations is the 
following observation [16]. 


Corollary 3.7 Let $\mathcal{A}$ be an oriented Apollonian gasket with initial configuration $C_1$,
$C_2$, $C_3$, $C_4$. If the bends of $C_1, C_2, C_3, C_4$ are integers, then all of the bends in $\mathcal{A}$
are integers.


Proof If the bends are integers $b_1, b_2, b_3, b_4$, then $b_4' = 2b_1 + 2b_2 + 2b_3 - b_4$ is
also an integer. Since every circle in the oriented Apollonian gasket is produced via
Descartes swaps, all of them will have integer bends. □
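The proof is effectively an algorithm: starting from a quadruple of bends, repeatedly replace one bend $b_k$ by $2(\text{sum of the other three}) - b_k$. A minimal sketch follows, with hypothetical function names and a breadth-first search as one possible design choice; it tracks only bends, not the circles themselves.

```python
def swap(bends, k):
    """Replace the k-th bend by 2*(sum of the other three) - itself (Corollary 3.6)."""
    others = sum(bends) - bends[k]
    return bends[:k] + (2 * others - bends[k],) + bends[k + 1:]


def quadruples_within(start, depth):
    """All bend quadruples reachable from `start` in at most `depth` swaps."""
    seen = {tuple(start)}
    frontier = {tuple(start)}
    for _ in range(depth):
        # Swap every position of every newly found quadruple, keeping only new results.
        frontier = {swap(q, k) for q in frontier for k in range(4)} - seen
        seen |= frontier
    return seen


def bends_within(start, depth):
    """Sorted list of the distinct bends appearing within `depth` swaps of `start`."""
    return sorted({b for q in quadruples_within(start, depth) for b in q})


print(bends_within((-1, 2, 2, 3), 3))
```

Because `swap` uses only integer addition and multiplication, integer input can never produce a non-integer bend, which is the content of the corollary.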


It is common to call an Apollonian gasket integral if the bends of all the circles 
in it are integers. Some examples are shown in Figure 3.17. Ever since Soddy first 
made this observation, number theorists have been interested in learning more about 
integral Apollonian gaskets and, in particular, what sort of integers show up as bends of
such geometric objects. While there has been extensive progress on this question, at 
Fig. 3.17 A collection of integral Apollonian gaskets, drawn with their bends.


the time of writing, it is still open, with the best-known result due to Jean Bourgain 
and Alex Kontorovich [1]. Most of the known partial results make use of heavy 
algebraic and analytic machinery, so we will not include them here. 
Problems


3.1 COMPUTATIONAL EXERCISES 


1. Draw a closed Steiner chain of length at least 8.
2. Find a closed Steiner chain of length 6 such that both of the following two
conditions hold.


a. One of the circles in the chain has inversive coordinates (—1, 1, 0). 
b. One of the circles defining the chain has inversive coordinates (3, 1, 2). 


To specify the chain uniquely, it is enough to give the inversive coordinates of the 
other circle defining the chain; if you want to be a real go-getter, you can find the 
inversive coordinates of the circles in the chain as well, but be advised that this 
is significantly harder. 


3. a) Suppose that $b_1, b_2, b_3, b_4$ are the bends of the initial configuration of an
Apollonian gasket. Let $b_4'$ be the bend of the Apollonian swap of $b_4$. Check that

\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 2 & 2 & 2 & -1 \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4' \end{pmatrix}.
\]

b) Find the matrices such that multiplying them by $(b_1, b_2, b_3, b_4)$ gives the other
three Apollonian swaps.

4. Find oriented circles $C_1, C_2, C_3, C_4$ which form a Descartes configuration, and
such that their bends are 0, 1, 1, 4, respectively.

5. Find oriented circles $C_1, C_2, C_3, C_4$ which form a Descartes configuration, and
such that their bends are −1, 2, 2, 3, respectively.

6. Calculate the bends of all of the circles that one can get from the standard oriented
Descartes configuration in no more than ten Descartes swaps. (You will want a
computer for this.) Investigate the set that you find. Can you make any conjectures
about what it does and does not contain?


3.2 PROOFS 


1. Let $C_1, C_2$ be two perpendicular generalized circles. For any point $p$ on $C_1$, prove
that there exists a unique generalized circle $C_3$ that is perpendicular to both $C_1$
and $C_2$ and which passes through $p$. (Hint: use a linear fractional transformation
to move one of the points where $C_1$ and $C_2$ intersect to $\infty$. How does this help?)
2. Let $C$ be a generalized circle that is perpendicular to $y = 0$ and $x = 0$. Prove
that the center of $C$ is 0.

3. Define a Pappus chain as follows: we start with two generalized circles $C_1$ and $C_2$
that intersect in a single point. Choose a point $p$ on $C_1$ and construct a generalized
circle $D_0$ that is tangent to $C_2$ and tangent to $C_1$ at $p$. Add on the generalized
circles that are tangent to $D_0$, $C_1$, and $C_2$; we claim that there are two of them,
$D_1$ and $D_{-1}$. Add on the generalized circles that are tangent to $D_1$, $C_1$, and $C_2$,
and tangent to $D_{-1}$, $C_1$, $C_2$; we claim that there are again two of them, $D_2$ and
$D_{-2}$. Continue inductively: at each step, add circles $D_k$ and $D_{-k}$ that are tangent
to $D_{k-1}$, $C_1$, and $C_2$ and tangent to $D_{-k+1}$, $C_1$, and $C_2$. The Pappus chain is the
union of $C_1$, $C_2$, and all of the $D_i$’s.


a) Draw a picture of a Pappus chain.

b) Prove that given $p$, $C_1$, and $C_2$, there exists a unique generalized circle $D_0$ that
is tangent to $C_2$ and tangent to $C_1$ at $p$.

c) Prove that there exist two generalized circles $D_1, D_{-1}$ that are tangent to $C_1$,
$C_2$, and $D_0$.

d) Use induction to prove that for every integer $k$, there exist two generalized
circles $D_{k+1}$ and $D_{k-1}$ that are tangent to $D_k$, $C_1$, and $C_2$.

e) Prove that if two circles in the Pappus chain intersect, then they are tangent.

f) What effect does changing the initial point $p$ have?


4. Our goal is to prove that the Apollonian gasket with starting configuration $C_1$,
$C_2$, $C_3$, $C_4$ is equal to the set $\mathcal{A}$ defined as

\[
\mathcal{A} = \bigcup_{n=1}^{\infty} \mathcal{A}_n,
\]

where $\mathcal{A}_1$ is the starting configuration, $\mathcal{A}_2$ consists of all Descartes swaps of
quadruples in $\mathcal{A}_1$, $\mathcal{A}_3$ consists of all Descartes swaps of quadruples in $\mathcal{A}_2$, and
so on.


a) Prove that the Apollonian gasket contains each $\mathcal{A}_n$. (Hint: you may want to use
induction.)

b) Prove that the Apollonian gasket contains $\mathcal{A}$.

c) Prove that $\mathcal{A}$ contains all Descartes swaps of elements in $\mathcal{A}$.

d) Conclude that $\mathcal{A}$ is the Apollonian gasket.


5. Prove that for any oriented circle C, inv(—C) = —inv(C). 
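6. We confirm some of the basic properties of the conjugate transpose.

a) Prove that $\overline{AB} = \overline{A}\,\overline{B}$ for all $2 \times 2$ matrices $A, B$ with complex coefficients.
Here,

\[
\overline{\begin{pmatrix} a & b \\ c & d \end{pmatrix}} = \begin{pmatrix} \bar{a} & \bar{b} \\ \bar{c} & \bar{d} \end{pmatrix}.
\]

b) Prove that if $A \in SL(2, \mathbb{C})$ then $\overline{A} \in SL(2, \mathbb{C})$.
c) Prove that $(AB)^T = B^T A^T$. Here,

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{T} = \begin{pmatrix} a & c \\ b & d \end{pmatrix}.
\]

d) Prove that if $A \in SL(2, \mathbb{C})$, then $A^T \in SL(2, \mathbb{C})$.
e) Prove that $(AB)^{\dagger} = B^{\dagger} A^{\dagger}$ for all $2 \times 2$ matrices $A, B$ with complex coefficients.
f) Prove that if $A \in SL(2, \mathbb{C})$, then $A^{\dagger} \in SL(2, \mathbb{C})$.


7. Prove that every matrix in $SL(2, \mathbb{C})$ can be written as a product of matrices of the
forms

\[
\begin{pmatrix} u & 0 \\ 0 & u^{-1} \end{pmatrix}, \quad \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\]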


8. a) Let $M, N$ be $n \times n$ matrices with $i,j$-th entries $a_{i,j}$ and $b_{i,j}$, respectively. By
the definition of matrix multiplication, the $i,j$-th entry of $MN$ is

\[
c_{i,j} = \sum_{k=1}^{n} a_{i,k} b_{k,j}.
\]

Use this observation to prove that $\mathrm{tr}(MN) = \mathrm{tr}(NM)$.

b) Use your answer to the previous part to show that for any $n \times n$ matrix $M$ and
any $G \in GL(n, \mathbb{C})$, $\mathrm{tr}(GMG^{-1}) = \mathrm{tr}(M)$.


9. Prove Corollary 3.4. 
10. The following describes an analog of the Apollonian gasket which comes from a 
paper by Guttler and Mallows [3]. 
a) Prove that for any circles $C_1, C_2, C_3$ that are externally tangent to one another,
there exist exactly two triples of circles $D_1, D_2, D_3$ such that $D_1, D_2, D_3$ are
externally tangent to one another, $D_1$ is externally tangent to $C_2$ and $C_3$, $D_2$ is
externally tangent to $C_1$ and $C_3$, and $D_3$ is externally tangent to $C_1$ and $C_2$. (Hint:
use a linear fractional transformation to move $C_1, C_2, C_3$ into a configuration
that is easier to think about. There are a number of choices for how to do this.)

b) We shall call a sextuple $C_1, C_2, C_3, D_1, D_2, D_3$ satisfying the properties above
an octahedral configuration. Prove that six oriented circles $C_1, C_2, C_3, D_1, D_2$,
$D_3$ form an octahedral configuration if and only if

\[
(C_i, C_j) = (D_i, D_j) = \begin{cases} 1 & \text{if } i \neq j \\ -1 & \text{if } i = j, \end{cases}
\qquad
(C_i, D_j) = \begin{cases} 1 & \text{if } i \neq j \\ 3 & \text{if } i = j. \end{cases}
\]

(Hint: use the fact that the inversive distance does not change under linear
fractional transformations, and force the sextuple into a position where it is
comparatively easy to compute what the inversive distance is for an octahedral
configuration.)

c) Draw an octahedral configuration. Choose any point in the interior of each
oriented circle. For any two points, connect them by a line if their corresponding
circles are externally tangent. Looking at the figure you have drawn, can you
see why this configuration is called octahedral? (Hint: what would the vertices
and edges of an octahedron look like if you squashed them flat onto a plane?)

d) Given an octahedral configuration $C_1, C_2, C_3, D_1, D_2, D_3$, denote by $D_1'$,
$D_2'$, $D_3'$ the other three oriented circles such that $C_1, C_2, C_3, D_1', D_2', D_3'$
is an octahedral configuration. Call the map $(C_1, C_2, C_3, D_1, D_2, D_3) \mapsto
(C_1, C_2, C_3, D_1', D_2', D_3')$ an octahedral swap. Show that any octahedral swap
can be understood as an inversion through some circle. (Hint: prove this for a
standard octahedral configuration first.)

e) Prove that if $C_1, C_2, C_3, D_1, D_2, D_3$ is an octahedral configuration, then
$\mathrm{inv}(C_1) + \mathrm{inv}(D_1) = \mathrm{inv}(C_2) + \mathrm{inv}(D_2) = \mathrm{inv}(C_3) + \mathrm{inv}(D_3)$. (Hint: first,
show that if this holds true for one octahedral configuration, then it is true for
all octahedral configurations. Then prove it for a convenient choice of octahedral
configuration.)

f) Given an octahedral configuration $C_1, C_2, C_3, D_1, D_2, D_3$, define its condensed
coordinates to be the quadruple $v_1 = \mathrm{inv}(C_1)$, $v_2 = \mathrm{inv}(C_2)$, $v_3 = \mathrm{inv}(C_3)$,
$v_4 = \mathrm{inv}(C_1) + \mathrm{inv}(D_1)$. Prove that the condensed coordinates of an octahedral
configuration specify it uniquely.
g) Consider an octahedral configuration with condensed coordinates $v_1, v_2, v_3, v_4$.
If $V$ is a matrix with columns $v_1, v_2, v_3, v_4$ and

\[
M = \begin{pmatrix} 0 & -1/2 & 0 & 0 \\ -1/2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},
\]

prove that

\[
V^T M V = R := \begin{pmatrix} 1 & -1 & -1 & -2 \\ -1 & 1 & -1 & -2 \\ -1 & -1 & 1 & -2 \\ -2 & -2 & -2 & -4 \end{pmatrix}.
\]

(Hint: you can prove this by passing to a standard configuration, but it is probably
easier to use the result of part b).)

h) Let $C_1, C_2, C_3, D_1, D_2, D_3$ be an octahedral configuration. Let the bends of
$C_1, C_2, C_3, D_1, D_2, D_3$ be $b_1, b_2, b_3, d_1, d_2, d_3$. Prove that $(d_1, d_2, d_3) =
(2X - b_1, 2X - b_2, 2X - b_3)$, where $X$ is a root of

\[
X^2 - 2X(b_1 + b_2 + b_3) + b_1^2 + b_2^2 + b_3^2 = 0.
\]

(Hint: use the previous exercise and emulate the proof of Descartes’ theorem.)

i) Let $C_1, C_2, C_3, D_1, D_2, D_3$ be an octahedral configuration with bends $b_1, b_2$,
$b_3, d_1, d_2, d_3$. Let $C_1, C_2, C_3, D_1', D_2', D_3'$ be the octahedral swap where the new
bends are $d_1', d_2', d_3'$. Prove that $d_i' = 4(b_1 + b_2 + b_3) - 2b_i - d_i$ for $i = 1, 2, 3$.
(Hint: use the previous exercise and emulate the proof of Corollary 3.6.)

j) Let $C_1, C_2, C_3, D_1, D_2, D_3$ be an octahedral configuration. An octahedral
packing with initial configuration $C_1, C_2, C_3, D_1, D_2, D_3$ is the smallest set of
oriented circles that satisfies the following two properties.


a. The set contains $C_1, C_2, C_3, D_1, D_2, D_3$.
b. The set contains all octahedral swaps of elements in the set.


We say that an octahedral packing is integral if the bends of all the circles 
contained inside of it are all integers. Prove that an octahedral packing is integral 
if and only if the bends of the initial configuration are all integers. (Hint: use 
the previous exercise and emulate the proof of Corollary 3.7.) 

k) Draw a picture of an octahedral packing. 


3.3 PROOFS (Group Theory) 


1. Prove that the Apollonian group is a group. 
2. Prove that $SL(2, \mathbb{Z}[i])$ is a group.

3. Define $PSL(2, \mathbb{C})$ to be the set of equivalence classes of $SL(2, \mathbb{C})$ where two
matrices $M_1, M_2$ are considered to be equivalent if there exists $\lambda \in \mathbb{C}$ such that
$M_1 = \lambda M_2$.


a) Let $M_1, M_1', M_2, M_2' \in SL(2, \mathbb{C})$ such that $M_1, M_1'$ are equivalent and
$M_2, M_2'$ are equivalent. Prove that $M_1 M_2$, $M_1' M_2'$ are equivalent.

b) Using the above, prove that $PSL(2, \mathbb{C})$ is a group. (Hint: The main problem
is showing that it has a well-defined multiplication on it. The previous part
suggested how to do this.)

c) Prove that

\[
\varphi : PSL(2, \mathbb{C}) \to PGL(2, \mathbb{C}), \qquad M \mapsto M
\]

is a well-defined group isomorphism.


4. Given a set $S$ and a group $G$, an action of $G$ on $S$ is a function $A : G \times S \to S$
satisfying the following properties.


a. If $e$ is the identity of $G$, then $A(e, s) = s$ for all $s \in S$.
b. For all $g, h \in G$ and all $s \in S$, $A(h, A(g, s)) = A(hg, s)$.


Wherever it is unlikely to cause confusion, it is customary to write $g.s$ instead
of $A(g, s)$.


a) Let $X$ be any set. Define $\mathrm{Sym}(X)$ to be the set of bijective functions $f : X \to
X$. Prove that $\mathrm{Sym}(X)$ is a group if we take the operation to be composition
of functions.

b) Let $A$ be an action of $G$ on $X$. Prove that

\[
G \to \mathrm{Sym}(X), \qquad g \mapsto (x \mapsto A(g, x))
\]

is a group homomorphism.
c) Prove that if $\phi : G \to \mathrm{Sym}(X)$ is a group homomorphism, then

\[
A : G \times X \to X, \qquad (g, x) \mapsto \phi(g)(x)
\]

is a group action.

5. Prove that

\[
\mathrm{Isom}(\mathbb{C}) \times \mathbb{C} \to \mathbb{C}, \qquad (\phi, z) \mapsto \phi(z)
\]

is a group action.
6. Prove that

\[
GL(2, \mathbb{C}) \times \mathbb{CP}^1 \to \mathbb{CP}^1, \qquad \left( \begin{pmatrix} a & b \\ c & d \end{pmatrix}, z \right) \mapsto \frac{az + b}{cz + d}
\]

is a group action. (Hint: you can prove this directly, of course, but it might be a
little more convenient to use the result of Exercise 3.3.4 and Theorem 2.4.)

7. Prove that Definition 3.6 defines a group action of $SL(2, \mathbb{C})$ on the set of oriented
circles in $\mathbb{CP}^1$.

8. A group action of $G$ on $X$ is called transitive if for every $x, y \in X$, there exists
$g \in G$ such that $g.x = y$.


a) Prove that $SL(2, \mathbb{C})$ acts transitively on the set of oriented circles in $\mathbb{CP}^1$.

b) Prove that the action of $GL(2, \mathbb{C})$ on $SL(2, \mathbb{C})$ by conjugation is not transitive.
(Hint: find two matrices $M_1, M_2 \in SL(2, \mathbb{C})$ with different traces. Is it
possible that $\gamma M_1 \gamma^{-1} = M_2$ for some $\gamma \in GL(2, \mathbb{C})$?)


9. A group action of $G$ on $X$ is called free if for every $g, h \in G$ and $x \in X$,
$g.x = h.x$ if and only if $g = h$.


a) Prove that $S^1$, the collection of all $z \in \mathbb{C}$ with $|z| = 1$, is a group under
multiplication.

b) Prove that the action of $S^1$ on $\mathbb{C}^{\times}$ by left multiplication is free.

c) Prove that $SL(2, \mathbb{C})$ does not act freely on the set of oriented circles in $\mathbb{CP}^1$.


10. Prove that $PSL(2, \mathbb{C})$ acts freely, transitively on the set of distinct triples of
points in $\mathbb{CP}^1$.

11. Prove that the Apollonian group acts freely, transitively on the set of oriented
Descartes configurations in the Apollonian gasket.



Construction of Hyperbolic Geometry 


In which we construct new spaces 
on which to act. 


We have carefully studied the properties of linear fractional transformations on 
the Euclidean plane; it is now time to look elsewhere. There are many possible 
candidates for exposition but probably the single most important is hyperbolic space. 
The hyperbolic plane was the original example of a non-Euclidean space—that is, 
a geometry that satisfied all of Euclid’s axioms for plane geometry save for what is 
now known as the Fifth Postulate. 


The Fifth Postulate 


If a line segment intersects two straight lines forming two interior angles on the 
same side that sum to less than two right angles, then the two lines, if extended 
indefinitely, meet on that side on which the angles sum to less than two right 
angles. 


This formulation is the way that Euclid originally phrased it, anyway, but I per- 
sonally prefer a slightly different version known as Playfair’s Axiom. 


Playfair’s Axiom 


Given a line / and a point p not on /, there is exactly one line /’ that passes through 
p and does not intersect /. 


The idea behind hyperbolic geometry is that rather than there being exactly one 
non-intersecting line, there are instead many, as shown in Figure 4.1. Hyperbolic 
geometry was developed in the 19th century when the mathematical soil was finally 
ready for such a thing to sprout. This is evidenced by the fact that it was discovered 
Fig. 4.1 On the left, an illustration of the Playfair axiom in the Euclidean plane. On the right, an
example of how this axiom is not satisfied by the hyperbolic plane. 


independently by at least four different mathematicians at around the same time: 
Nikolai Ivanovich Lobachevsky, Janos Bolyai, Carl Friedrich Gauss, and Franz Tau- 
rinus. We shall develop a model of hyperbolic geometry that was originally put 
forward by Italian mathematician Eugenio Beltrami in 1868, although, in a clas- 
sic instance of Stigler’s law of eponymy, it usually carries Henri Poincaré’s name 
instead. Before we get into this properly, we will first examine what we even mean 
by a geometry for our purposes. 


4.1 Metric Geometry 


Historically, geometry was first studied axiomatically (a la Euclid), then analytically 
using coordinates (a la Descartes), and then using differential forms (a la Gauss and 
Riemann). Instead of developing any of these potent machines, we will instead leap 
forward in time to 1906 and consider Maurice Fréchet’s notion of a metric space. 


Definition 4.1 A metric space $(X, d)$ is a set $X$ together with a function $d : X \times X \to
[0, \infty)$ called the metric which satisfies the following three properties.


1. For all $x, y \in X$, $d(x, y) = 0$ if and only if $x = y$.
2. For all $x, y \in X$, $d(x, y) = d(y, x)$.
3. For all $x, y, z \in X$, $d(x, y) + d(y, z) \geq d(x, z)$.


Fréchet introduced this notion in his thesis, securing its place as a foundational 
paper in mathematics. Today, metric spaces are used to phrase the basic theorems of 
real analysis, extend the notions of limits to spaces of functions (as Fréchet himself 
did), analyze error-correcting codes, and much more. But what is a metric space? 

I claim that you are already familiar with at least one example of a metric space.
Specifically, consider the set R”—that is, n-dimensional Euclidean space. You can 
take n = 2 if you want to feel more comfortable. Then, consider the Euclidean 
distance function 


Fig. 4.2 On the left, an illustration of the triangle inequality in the Euclidean plane; on the right, 
an illustration of the discrete metric space with three points. 


\[
d_{\mathrm{Euclid}} : \mathbb{R}^n \times \mathbb{R}^n \to [0, \infty), \qquad
\big((x_1, \ldots, x_n), (y_1, \ldots, y_n)\big) \mapsto \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}.
\]

This is a metric space. Indeed, it is true that $d_{\mathrm{Euclid}}(x, y) = 0$ if and only if $x = y$; it
is true that $d_{\mathrm{Euclid}}(x, y) = d_{\mathrm{Euclid}}(y, x)$. Neither of these is difficult to prove. What
about the last assertion? Well, all it is saying is that if we have three points $x, y, z$,
then the distance from $x$ to $y$ plus the distance from $y$ to $z$ can’t be less than the
distance from $x$ to $z$. This is just the triangle inequality

\[
d_{\mathrm{Euclid}}(x, y) + d_{\mathrm{Euclid}}(y, z) \geq d_{\mathrm{Euclid}}(x, z).
\]


Proving this is a little trickier (see Exercise 4.5.4), but it is intuitively clear, as 
shown in Figure 4.2. In any case, we now understand what a metric space is: it is 
just a set together with a distance function, where the notion of “distance” is just a 
straightforward generalization of the usual Euclidean one. For us, this is what we 
will mean by a “geometry”: it is a choice of metric. 

Let’s give another example of a metric space. Let X be any set whatsoever, and 
define a function 


\[
d_{\mathrm{discrete}} : X^2 \to [0, \infty), \qquad
(x, y) \mapsto \begin{cases} 1 & \text{if } x \neq y \\ 0 & \text{otherwise.} \end{cases}
\]


This is known as the discrete metric, and it turns $X$ into a metric space. I leave the
proof for the reader; it isn’t hard. (See Exercise 4.5.2.) This example highlights that
while metric geometry is good enough to talk about distances, it isn’t usually good
enough to talk about angles. After all, $\{1, 2, 3\}$ together with $d_{\mathrm{discrete}}$ is a perfectly
good metric space (illustrated in Figure 4.2), but there is no reasonable way to talk
about angles between paths in this space, particularly since there is no such thing as
a (continuous) path between any two points in this geometry. While this is limiting,
we’ll see that for the examples that we consider, one can define angles in a reasonable
way, so we won’t worry about this too much. Other geometric notions that we defined
in Chapter 1 generalize much more easily.
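For readers who like to experiment, the three properties in the definition of a metric space can be tested by brute force on a finite sample of points. The sketch below is only a sanity check, not a proof: passing it shows that no counterexample exists among the sampled points. The function names and the tolerance `tol` (there to absorb floating-point rounding) are illustrative assumptions.

```python
import itertools
import math


def is_metric_on(points, d, tol=1e-12):
    """Check the three metric properties for d on every pair and triple in `points`."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:
            return False
        # Property 1: d(x, y) = 0 exactly when x = y.
        if (abs(d(x, y)) <= tol) != (x == y):
            return False
        # Property 2: symmetry.
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in itertools.product(points, repeat=3):
        # Property 3: the triangle inequality.
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False
    return True


def euclidean(x, y):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.dist(x, y)


def discrete(x, y):
    """The discrete metric: 1 for distinct points, 0 otherwise."""
    return 0 if x == y else 1


grid = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
print(is_metric_on(grid, euclidean))
print(is_metric_on([1, 2, 3], discrete))
```

A function that fails the triangle inequality, such as the squared difference $(x - y)^2$ on the sample $\{0, 1, 2\}$, is rejected by the same check.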
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Definition 4.2 For any two metric spaces (X, dx), (Y, dy), an isometry between 
them is a function ‘¥ : X — Y such that for all x1, x2 € X, dy(¥(x1), ¥(x2)) = 
dx (x1, x2). Furthermore, 'V is called an isometric isomorphism if it is also bijective. 
We will write Isom(X, d) to denote the set of isometries of (X, d) with itself. If the 
metric d is clear from context, we will instead abbreviate this as Isom(X). 


It is often the case that all of the isometries X —> X are bijective and so the 
two notions of isometry and isometric isomorphism coincide: this was the case, for 
example, for isometries of the Euclidean plane. For general metric spaces, though, 
they can be different. Let’s see this by thinking about discrete metric spaces. If X is a
discrete metric space, then Ψ : X → X is an isometry if and only if for all x, y ∈ X,

\[
d_{\mathrm{discrete}}(\Psi(x), \Psi(y))
= \begin{cases} 1 & \text{if } \Psi(x) \neq \Psi(y) \\ 0 & \text{otherwise} \end{cases}
= \begin{cases} 1 & \text{if } x \neq y \\ 0 & \text{otherwise} \end{cases}
= d_{\mathrm{discrete}}(x, y),
\]

which is to say that x = y if and only if Ψ(x) = Ψ(y). In other words, Ψ is an
isometry if and only if it is injective. If X is infinite, there can be
maps that are injective but not surjective: for instance, if X = ℤ, the map n ↦ 2n
is just such a thing. Even so, there are restrictions on what isometries can be: any
isometry is always injective (see Exercise 4.5.6), any composition of isometries is
necessarily an isometry (see Exercise 4.5.7), and if an isometry has an inverse, then
that inverse is also an isometry (see Exercise 4.5.8).


> Example For x, y ∈ ℝ, define |(x, y)|₁ = |x| + |y|. The taxicab metric on ℝ²
is defined as d₁(p₁, p₂) = |p₁ − p₂|₁. Prove that it is a metric.

Write p₁ = (x₁, y₁) and p₂ = (x₂, y₂). Then d₁(p₁, p₂) = |x₁ − x₂| + |y₁ − y₂|.
This makes it evident that for all p₁, p₂ ∈ ℝ², d₁(p₁, p₂) = 0 if and only if p₁ = p₂
and d₁(p₁, p₂) = d₁(p₂, p₁). The only tricky part is the triangle inequality. Adding
a third point p₃ = (x₃, y₃), we note that

\[
\begin{aligned}
d_1(p_1, p_2) + d_1(p_2, p_3) &= |x_1 - x_2| + |y_1 - y_2| + |x_2 - x_3| + |y_2 - y_3| \\
&\geq |x_1 - x_3| + |y_1 - y_3| = d_1(p_1, p_3),
\end{aligned}
\]

where in the second to last step, we used the fact that |a − b| + |b − c| ≥ |a − c|,
which is just the triangle inequality for (ℝ, d_Euclid).
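If you would like to experiment, here is a minimal Python sketch that spot-checks the three metric axioms on a finite sample of points, for both the discrete metric and the taxicab metric. The helper names (`d_discrete`, `d_taxicab`, `satisfies_metric_axioms`) and the tolerance are my own choices for illustration, not notation from the text. Passing such a check is evidence, not a proof.

```python
import itertools
import random

def d_discrete(p, q):
    """Discrete metric: 1 for distinct points, 0 otherwise."""
    return 0 if p == q else 1

def d_taxicab(p, q):
    """Taxicab metric on R^2: |x1 - x2| + |y1 - y2|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check the metric axioms for d on every triple of sample points."""
    for p, q, r in itertools.product(points, repeat=3):
        if (d(p, q) == 0) != (p == q):          # d(p, q) = 0 exactly when p = q
            return False
        if d(p, q) != d(q, p):                  # symmetry
            return False
        if d(p, q) + d(q, r) < d(p, r) - tol:   # triangle inequality
            return False
    return True

random.seed(0)
sample = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(12)]
print(satisfies_metric_axioms(d_taxicab, sample))
print(satisfies_metric_axioms(d_discrete, sample))
```

A function such as (p, q) ↦ (p − q)² on ℝ is a useful non-example to feed to the same checker: it is symmetric and vanishes only on the diagonal, but it fails the triangle inequality.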


4.2. The Real Special Linear Group 


Up until this point, our general tendency has been to first describe a geometry, 
and then to work out the transformations that preserve its key properties. Thus, for 
example, we defined the plane with the Euclidean metric and then determined what 
the isometries were; just so, we defined the discrete metric in the previous section 
and then worked out the isometries. Now, we are going to flip the script: we will start 
with what we want the isometries to be, and we shall try to find a nice metric space 
that matches. One possible justification for this approach is the following. 


Philosophical Principle 


If you want to understand a group G, find a nice space X such that G can be inter- 
preted as the group of transformations X — X with some convenient properties, 
such as them being isometries. 


Later in the chapter, we will do precisely this for the full group Möb(2). For now,
we shall try this for a somewhat smaller, easier to work with, subgroup.


Definition 4.3 The special linear group on ℝ², denoted by SL(2, ℝ), consists of
all 2 × 2 matrices with real coefficients with determinant 1. The group PSL(2, ℝ)
is the image of this group in Möb°(2); that is, it consists of all transformations

\[ z \mapsto \frac{az + b}{cz + d} \]

where a, b, c, d ∈ ℝ and ad − bc = 1.


That SL(2, ℝ) is a subgroup of SL(2, ℂ) is easy to check. (See Exercise 4.2.2.)
In the next section, we will construct a metric space X ⊂ ℂP¹ such that linear
fractional transformations in PSL(2, ℝ) are isometries of X. To set the scene, we
need a subset X such that every transformation in PSL(2, ℝ) moves it back to itself.


Definition 4.4 The upper half-plane ℍ² is the set of points z ∈ ℂ such that ℑ(z) > 0.
The boundary ∂ℍ² of the upper half-plane is ℝ ∪ {∞}.


Lemma 4.1 For any φ ∈ PSL(2, ℝ), φ(ℍ²) = ℍ².


Proof What is ℍ²? It is just the interior of the real line, oriented such that i is in the
interior. If φ ∈ Möb(2), then φ(ℍ²) = ℍ² if and only if φ sends the real line with
this orientation back to itself. If we refer back to the exercise at the end of Section
3.4, we see that, indeed, if φ ∈ PSL(2, ℝ), then this is exactly what happens.


A consequence of this is that it is entirely sensible to ask how elements in
PSL(2, ℝ) move elements in ℍ²; an illustration is shown in Figure 4.3. This is,
however, a little complicated. It’s a good idea to come up with some simple examples
of elements in PSL(2, ℝ). Notice that if s ∈ ℝ, λ > 0, then the maps z ↦ z + s,
z ↦ λz, z ↦ −z⁻¹ are all in PSL(2, ℝ) since they correspond to the matrices

\[
\begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} \sqrt{\lambda} & 0 \\ 0 & 1/\sqrt{\lambda} \end{pmatrix}, \quad
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\]

which are all in SL(2, ℝ). Another useful transformation that preserves the upper
half-plane, but is not in PSL(2, ℝ), is z ↦ −z̄. Together, these transformations help
us prove nice properties about PSL(2, ℝ).

Fig. 4.3 An example of how SL(2, ℝ) moves points in ℍ². The illustration on the right is the image
of the illustration on the left under an element of SL(2, ℝ).
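To get a feel for these maps, the following Python sketch applies each of them to a point of the upper half-plane and confirms that the imaginary part stays positive. Python’s built-in complex numbers stand in for ℂ, and the function names are hypothetical helpers of mine rather than notation from the text.

```python
def mobius(a, b, c, d):
    """Return the map z -> (a*z + b)/(c*z + d) for a real matrix of determinant 1."""
    if abs(a * d - b * c - 1.0) > 1e-9:
        raise ValueError("matrix must have determinant 1")
    return lambda z: (a * z + b) / (c * z + d)

def translation(s):
    """z -> z + s, from the matrix (1 s; 0 1)."""
    return mobius(1.0, s, 0.0, 1.0)

def dilation(lam):
    """z -> lam*z for lam > 0, from the matrix (sqrt(lam) 0; 0 1/sqrt(lam))."""
    root = lam ** 0.5
    return mobius(root, 0.0, 0.0, 1.0 / root)

inversion = mobius(0.0, -1.0, 1.0, 0.0)   # z -> -1/z

def reflection(z):
    """z -> -conj(z); preserves the upper half-plane but is not in PSL(2, R)."""
    return -z.conjugate()

z = 0.3 + 2.0j
for name, f in [("translation", translation(1.5)), ("dilation", dilation(4.0)),
                ("inversion", inversion), ("reflection", reflection)]:
    w = f(z)
    print(name, w, w.imag > 0)
```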


Theorem 4.1 The map

\[
\begin{aligned}
SL(2, \mathbb{R}) &\to \{ \psi \in \text{Möb}^{\circ}(2) \mid \psi(\mathbb{H}^2) = \mathbb{H}^2 \} \\
\begin{pmatrix} a & b \\ c & d \end{pmatrix} &\mapsto \left( z \mapsto \frac{az + b}{cz + d} \right)
\end{aligned}
\]

is a surjective group homomorphism.


Proof It is easy to check that

\[ \{ \psi \in \text{Möb}^{\circ}(2) \mid \psi(\mathbb{H}^2) = \mathbb{H}^2 \} \]

is a group and so this map is just a restriction of the usual group homomorphism
SL(2, ℂ) → Möb°(2). By Lemma 4.1, we know that the image of SL(2, ℝ) preserves
the upper half-plane. Therefore, this is a well-defined group homomorphism; it
remains to prove that it is surjective. Choose an arbitrary element ψ ∈ Möb°(2) such
that ψ(ℍ²) = ℍ². We know that there exist a, b, c, d ∈ ℂ such that ad − bc = 1
and

\[ \psi(z) = \frac{az + b}{cz + d}. \]

We shall show that in fact a, b, c, d ∈ ℝ. We know that ψ(∂ℍ²) = ∂ℍ², hence
ψ(∞) = a/c ∈ ∂ℍ² and ψ⁻¹(0) = −b/a ∈ ∂ℍ². By composing with z ↦ −1/z
if necessary, we may assume that a ≠ 0, hence c/a, b/a ∈ ℝ. But

\[
\begin{pmatrix} 1 & 0 \\ -c/a & 1 \end{pmatrix}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} 1 & -b/a \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} a & 0 \\ 0 & 1/a \end{pmatrix},
\]

and this new matrix is in SL(2, ℝ) if and only if the original one was. Thus, we may
assume without loss of generality that b = c = 0, so ψ(z) = a²z. This preserves
∂ℍ² if and only if a² ∈ ℝ^×. Moreover, if a² < 0, then ψ(i) ∉ ℍ². This implies that
a² > 0, which in turn means that a ∈ ℝ, concluding the proof.


Corollary 4.1 For any ψ ∈ Möb(2), ψ(ℍ²) = ℍ² if and only if either ψ ∈
PSL(2, ℝ) or ψ ∘ ϕ ∈ PSL(2, ℝ), where ϕ(z) = −z̄.


Proof If ψ ∈ Möb°(2), this follows from Theorem 4.1. If ψ ∉ Möb°(2), then ψ
composed with z ↦ −z̄ is in Möb°(2), and so the result follows.


With this motivation, we define a group Isom(ℍ²).

Definition 4.5 We define

\[
\begin{aligned}
\mathrm{Isom}(\mathbb{H}^2) &= \{ \psi \in \text{Möb}(2) \mid \psi(\mathbb{H}^2) = \mathbb{H}^2 \} \\
&= PSL(2, \mathbb{R}) \cup \{ \phi \circ \psi \mid \psi \in PSL(2, \mathbb{R}) \},
\end{aligned}
\]

where ϕ(z) = −z̄.


Remark 4.1 It is easy to check that this is a group; see Exercise 4.2.3.


We are jumping the gun here a little: what we are effectively claiming is that this
is the isometry group of the upper half-plane. However, we haven’t defined a metric on ℍ²
yet, so it is meaningless to talk about isometries! Nevertheless, in the next section,
we will define a metric on ℍ² and then it really will be the case that Isom(ℍ²) will
be the set of isometries ℍ² → ℍ².

Before we are ready to make that construction, though, we will need a few results
showing how Isom(ℍ²) and PSL(2, ℝ) act on points in ℍ² and its boundary.


Lemma 4.2 For any three distinct points z₁, z₂, z₃ ∈ ∂ℍ², there exists an element
ψ ∈ Isom(ℍ²) such that ψ(z₁) = 0, ψ(z₂) = 1, and ψ(z₃) = ∞.


Proof We will construct ψ by stages. First, if z₃ ≠ ∞, define ψ₁(z) = (z₃ − z)⁻¹;
this is the image of

\[ \begin{pmatrix} 0 & 1 \\ -1 & z_3 \end{pmatrix} \in SL(2, \mathbb{R}). \]

If z₃ = ∞, define ψ₁(z) = z. In either case, ψ₁ ∈ PSL(2, ℝ) and ψ₁(z₃) = ∞. Since
ψ₁(z₁) ≠ ∞, we may then consider the translation ψ₂(z) = z − ψ₁(z₁). The composition
ψ₂ ∘ ψ₁ sends z₁ ↦ 0 and z₃ ↦ ∞. Thus (ψ₂ ∘ ψ₁)(z₂) is a non-zero real number. If it is less than
zero, define ψ₃(z) = −z̄; otherwise, ψ₃(z) = z. Then ψ₃ ∘ ψ₂ ∘ ψ₁ ∈ Isom(ℍ²)
and sends z₁ ↦ 0, z₃ ↦ ∞, and z₂ to some positive real number r. Finally, define
ψ₄(z) = z/r, so ψ₄ ∘ ψ₃ ∘ ψ₂ ∘ ψ₁ is the desired element of Isom(ℍ²).


Fig. 4.4 An illustration of how Isom(ℍ²) acts transitively on triples of points on the boundary of
ℍ²; the right-hand side is the image of the left under a transformation that sends the three vertices
of the “triangle” to 0, 1, ∞.
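The staged construction in the proof of Lemma 4.2 is effectively an algorithm, and it can be sketched in a few lines of Python. This is only an illustration under some assumptions of mine: boundary points are modelled as floats, with `math.inf` playing the role of the single point ∞, and the reflection z ↦ −z̄ is implemented as x ↦ −x, which is how it acts on real numbers. The name `normalize_triple` is hypothetical.

```python
import math

INF = math.inf  # stands for the single point at infinity of the boundary

def normalize_triple(z1, z2, z3):
    """Return a function on R ∪ {∞} sending z1 -> 0, z2 -> 1, z3 -> ∞.

    The four stages mirror psi_1, ..., psi_4 in the proof.
    """
    def psi1(x):                      # move z3 to infinity
        if z3 == INF:
            return x
        if x == INF:
            return 0.0
        return INF if x == z3 else 1.0 / (z3 - x)

    shift = psi1(z1)

    def psi2(x):                      # translate psi1(z1) to 0
        return x if x == INF else x - shift

    sign = -1.0 if psi2(psi1(z2)) < 0 else 1.0

    def psi3(x):                      # reflect if z2 landed on the negative side
        return x if x == INF else sign * x

    r = psi3(psi2(psi1(z2)))          # a positive real number

    def psi4(x):                      # rescale so that z2 lands on 1
        return x if x == INF else x / r

    return lambda x: psi4(psi3(psi2(psi1(x))))

psi = normalize_triple(2.0, 5.0, -1.0)
print(psi(2.0), psi(5.0), psi(-1.0))
```

Composing one such map with the inverse of another is exactly how the existence half of the next theorem proceeds.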


Theorem 4.2 For any pair of triples of distinct points z₁, z₂, z₃ ∈ ∂ℍ², w₁, w₂, w₃ ∈
∂ℍ², there exists a unique ψ ∈ Isom(ℍ²) such that ψ(z₁) = w₁, ψ(z₂) = w₂, and
ψ(z₃) = w₃.


Proof Showing existence is easy; simply apply Lemma 4.2 to produce ψ₁, ψ₂ ∈
Isom(ℍ²) such that ψ₁(z₁) = ψ₂(w₁) = 0, ψ₁(z₂) = ψ₂(w₂) = 1, and ψ₁(z₃) =
ψ₂(w₃) = ∞. Composing ψ₂⁻¹ ∘ ψ₁ produces the desired element of Isom(ℍ²). On
the other hand, if ϕ ∈ Möb(2) satisfies ϕ(ℍ²) = ℍ², ϕ(z₁) = w₁, ϕ(z₂) = w₂, and
ϕ(z₃) = w₃, then we claim that ϕ = ψ₂⁻¹ ∘ ψ₁. Equivalently, ψ₃ = ψ₂ ∘ ϕ ∘ ψ₁⁻¹
is the identity function. We know that ψ₃(0) = 0, ψ₃(1) = 1, and ψ₃(∞) = ∞;
therefore, either ψ₃(z) = z or ψ₃(z) = z̄. But conjugation does not preserve the
upper half-plane, hence ψ₃ = id, establishing uniqueness.


An example of this transitive action on triples of points on the boundary is shown
in Figure 4.4. We also need to know how points off the boundary are moved around.


Theorem 4.3 For any two distinct points z₁, z₂ ∈ ℍ², there exist exactly two elements
ψ ∈ Isom(ℍ²) such that ψ(z₁) = i and ψ(z₂) = it for some real t > 1;
one element is in PSL(2, ℝ) and the other isn’t. Furthermore, the parameter t is the
same for both.


Proof First, we show that there exist two such elements. We’re going to
carefully choose two points a, b ∈ ∂ℍ² such that if we choose an element ψ₁ ∈
Isom(ℍ²) so that a ↦ 0 and b ↦ ∞, then z₁, z₂ will fall on the imaginary axis,
with ℑ(ψ₁(z₂)) > ℑ(ψ₁(z₁)). Assuming we can do this, the rest of the construction
is easy: first, define ψ₂(z) = z/ℑ(ψ₁(z₁)), so then ψ = ψ₂ ∘ ψ₁ ∈ Isom(ℍ²) sends
z₁ ↦ i and z₂ ↦ it for some t > 1. The desired second element is given by ϕ ∘ ψ,
where ϕ(z) = −z̄.

That there exists an element in Isom(ℍ²) which can send arbitrary points on the
boundary to 0 and ∞ is guaranteed by Theorem 4.2. But how to choose a, b to
accomplish what we want? Consider the line l through z₁ and z₂. There are three
cases:

1. l is a vertical line and ℑ(z₁) < ℑ(z₂).
2. l is a vertical line and ℑ(z₁) > ℑ(z₂).
3. l is not a vertical line.


In the first case, define a = ℜ(z₁) and b = ∞. In the second case, define a = ∞
and b = ℜ(z₁). The third case is the trickiest: consider the perpendicular bisector
of the line segment from z₁ to z₂. This perpendicular bisector intersects ℝ at some
point; construct a circle through z₁ and z₂ whose center is that point. This circle is
perpendicular to ℝ. Define a and b to be the points of intersection, chosen such that
there exists a path from a to z₁ to z₂ to b that traverses the circle either clockwise or
counterclockwise. All three of these constructions are depicted in Figure 4.5.

Fig. 4.5 A visual sketch of the existence part of the proof of Theorem 4.3. (a), (b), and (c) show
the possible configurations of a pair of points z₁, z₂ ∈ ℍ² and the corresponding choice of a, b. (d)
shows where ψ₁ sends z₁, z₂, a, b.
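The three cases above translate directly into a short computation. The sketch below is an illustration rather than part of the proof. The formula for the center c comes from solving |z₁ − c| = |z₂ − c| for real c, which is my own derivation, and the function name `boundary_endpoints` is hypothetical.

```python
import math

def boundary_endpoints(z1, z2):
    """Return (a, b) in R ∪ {∞} following the three cases of the construction."""
    if z1.real == z2.real:                       # the line through z1, z2 is vertical
        if z1.imag < z2.imag:
            return (z1.real, math.inf)           # case 1
        return (math.inf, z1.real)               # case 2
    # Case 3: the perpendicular bisector meets R at the point c with |z1 - c| = |z2 - c|.
    c = (abs(z2) ** 2 - abs(z1) ** 2) / (2.0 * (z2.real - z1.real))
    radius = abs(z1 - c)
    left, right = c - radius, c + radius
    # Along the upper semicircle the real part increases from left to right,
    # so this ordering puts a, z1, z2, b in that order along the circle.
    if z1.real < z2.real:
        return (left, right)
    return (right, left)

print(boundary_endpoints(1j, 2j))
print(boundary_endpoints(-1 + 1j, 1 + 1j))
```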

Now, suppose that we choose two elements ψ, ϕ ∈ Isom(ℍ²) such that ψ(z₁) =
ϕ(z₁) = i, and ψ(z₂) = it₁, ϕ(z₂) = it₂ for some t₁, t₂ > 1. Let φ = ψ ∘ ϕ⁻¹.
Then φ(i) = i, and φ(it₂) = it₁. Consider the line x = 0. It is perpendicular to
ℝ; since φ preserves ℝ, its image must be a generalized circle orthogonal to ℝ. It
also passes through the points i and it₂, so its image must pass through the points
i and it₁. However, there is only one generalized circle that passes through those
points and is orthogonal to ℝ, and that is the line x = 0 itself. Therefore, φ must
either fix both 0 and ∞, or it must swap them. Since any element of Möb°(2) is fully
determined by where it sends three points and we know that φ(i) = i, φ(0) ∈ {0, ∞},
φ(∞) ∈ {0, ∞}, this leaves us with four possibilities for φ, namely

\[
\varphi_1(z) = z, \qquad \varphi_2(z) = -\bar{z}, \qquad
\varphi_3(z) = -z^{-1}, \qquad \varphi_4(z) = \bar{z}^{-1}.
\]

Since we also know that φ(it₂) = it₁ and t₁, t₂ > 1, this rules out the second two
possibilities, leaving just two cases. In both, φ(it₂) = it₂, hence t₁ = t₂.


4.3 The Poincaré Half-Plane 


We are finally ready to construct a metric on ℍ² such that the corresponding isometry
group is Isom(ℍ²). Strictly speaking, this sole requirement is insufficient to uniquely
determine the metric even up to scaling; see Exercise 4.5.5. So, we should ask for
some additional nice property. There are many possible choices, but here is a simple
one: we’ll ask that the line x = 0 acts like a line in Euclidean space, in the sense
that if d : ℍ² × ℍ² → [0, ∞) is the metric, then d(it₁, it₂) + d(it₂, it₃) = d(it₁, it₃) for
any 0 < t₁ < t₂ < t₃. This additional requirement is just about enough. We start by
defining d on the line x = 0.


Lemma 4.3 Let L be the half-line x = 0, y > 0. For any r > 0, there exists a
unique function d_r : L × L → [0, ∞) such that

1. d_r(i, ei) = r,
2. d_r(x, y) = d_r(y, x) for any x, y ∈ L,
3. for any 0 < t₁ < t₂ < t₃, d_r(it₁, it₂) + d_r(it₂, it₃) = d_r(it₁, it₃), and
4. for any γ ∈ Isom(ℍ²) such that γ(L) = L, and any x, y ∈ L, d_r(x, y) =
d_r(γ(x), γ(y)).

Concretely,

\[ d_r(z_1, z_2) = r \left| \ln\left( \frac{z_2}{z_1} \right) \right|. \]


Remark 4.2 To clear up any confusion: the ‘e’ in the statement of the lemma is just
Euler’s number. We could do the argument with e replaced with any other positive
real number, but we will see that this is a convenient choice.


Proof If γ ∈ Isom(ℍ²) and γ(L) = L, then γ fixes the set {0, ∞}. In SL(2, ℝ),
there are only two types of matrices that do this, namely

\[
\begin{pmatrix} \lambda & 0 \\ 0 & 1/\lambda \end{pmatrix}, \quad
\begin{pmatrix} 0 & \lambda \\ -1/\lambda & 0 \end{pmatrix}
\]

for some λ ∈ ℝ^×. Consequently, we know that γ(z) = λ²z, −λ²/z, −λ²z̄, or λ²/z̄.
Note that z ↦ −z̄ fixes every point in L, so in fact if d_r(x, y) = d_r(γ(x), γ(y))
for γ(z) = λ²z and −λ²/z, then it immediately follows that it is true for the other
two. Thus, it shall actually suffice to check that d_r(x, y) = d_r(γ(x), γ(y)) for all
x, y ∈ L, γ(z) = λ²z and γ(z) = −1/z.

We’ll start by determining what d_r(e^{l/n}i, e^{(l+1)/n}i) is for l, n ∈ ℤ with 0 ≤ l < n.
If we take γ(z) = e^{1/n}z, then γ(e^{l/n}i) = e^{(l+1)/n}i and γ(e^{(l+1)/n}i) = e^{(l+2)/n}i.
From this, we know that d_r(e^{l/n}i, e^{(l+1)/n}i) does not depend on the choice of l. This
partitioning is illustrated in Figure 4.6. Using this, we get

\[
\begin{aligned}
r = d_r(i, ei) &= d_r(i, e^{1/n}i) + d_r(e^{1/n}i, e^{2/n}i) + \ldots + d_r(e^{(n-1)/n}i, ei) \\
&= n\, d_r(e^{l/n}i, e^{(l+1)/n}i),
\end{aligned}
\]

so d_r(e^{l/n}i, e^{(l+1)/n}i) = r/n. What we have done is partition the line segment from i
to ei into pieces of equal length, as in Figure 4.6. We deduce that d_r(i, e^{l/n}i) = rl/n
for any l, n ∈ ℤ such that 0 ≤ l/n ≤ 1. By applying a transformation z ↦ tz, we
can conclude that d_r(ti, te^{q}i) = rq for any q ∈ ℚ ∩ [0, 1]. However, for any q ∈ ℚ
such that q > 1, there exists some n ∈ ℕ such that 0 < q/n < 1, and therefore

\[
\begin{aligned}
d_r(ti, te^{q}i) &= d_r(ti, te^{q/n}i) + d_r(te^{q/n}i, te^{2q/n}i) + \ldots + d_r(te^{(n-1)q/n}i, te^{q}i) \\
&= d_r(ti, te^{q/n}i) + \ldots + d_r\left(e^{q(n-1)/n}(ti), e^{q(n-1)/n}(te^{q/n}i)\right) \\
&= n\, d_r(ti, te^{q/n}i) = n(rq/n) = rq.
\end{aligned}
\]

So, for any positive rational q and z ∈ L, we know that d_r(z, e^{q}z) = d_r(e^{q}z, z) = rq.
If we replace z with e^{−q}z, though, then the above gives us that d_r(e^{−q}z, z) =
d_r(z, e^{−q}z) = rq. Thus, what we have so far demonstrated is that for any z ∈ L and
q ∈ ℚ, d_r(z, e^{q}z) = d_r(e^{q}z, z) = r|q|. Our next objective must be to extend this so
that it applies to all of ℝ and not merely ℚ. This can be accomplished as follows.
Choose any t ∈ ℝ and any q₁, q₂ ∈ ℚ such that q₁ < t < q₂. Since

\[
\begin{aligned}
d_r(i, e^{q_1}i) &\leq d_r(i, e^{q_1}i) + d_r(e^{q_1}i, e^{t}i) = d_r(i, e^{t}i) \\
d_r(i, e^{t}i) &\leq d_r(i, e^{t}i) + d_r(e^{t}i, e^{q_2}i) = d_r(i, e^{q_2}i),
\end{aligned}
\]

we have that r|q₁| = d_r(i, e^{q₁}i) ≤ d_r(i, e^{t}i) ≤ d_r(i, e^{q₂}i) = r|q₂|. But we can
force q₁ and q₂ to be arbitrarily close to t, so we see that in fact it must be that
d_r(i, e^{t}i) = r|t| for any t ∈ ℝ. From this, it follows instantly that d_r(z, e^{t}z) = r|t|
for any z ∈ L and t ∈ ℝ, which in turn can be restated as d_r(z, λz) = r|ln(λ)| for
any z ∈ L, λ ∈ (0, ∞). Of course, for any z₁, z₂ ∈ L, z₂/z₁ ∈ (0, ∞), so we can
state this as

\[ d_r(z_1, z_2) = d_r\left(z_1, \frac{z_2}{z_1} z_1\right) = r \left| \ln\left(\frac{z_2}{z_1}\right) \right| \]

for all z₁, z₂ ∈ L. It is easy to check that this is indeed invariant under z ↦ λ²z and
z ↦ −1/z, and so we are done.

Fig. 4.6 From left to right, the same line segment in ℍ² is partitioned into smaller and smaller
segments of equal (hyperbolic) length.

Fig. 4.7 Each of the red segments is the image of any other under some element in Isom(ℍ²). In
particular, the (hyperbolic) distance between any two endpoints is exactly the same.


Having defined this function on the line x = 0, we can now uniquely extend it
to all of ℍ², allowing us to compare the lengths of arbitrary line segments, as in
Figure 4.7.


Lemma 4.4 For any r > 0, there exists a unique function d_r : ℍ² × ℍ² → [0, ∞)
such that

1. d_r(i, ei) = r,
2. d_r(x, y) = d_r(y, x) for any x, y ∈ ℍ²,
3. for any 0 < t₁ < t₂ < t₃, d_r(it₁, it₂) + d_r(it₂, it₃) = d_r(it₁, it₃), and
4. for any γ ∈ Isom(ℍ²), d_r(x, y) = d_r(γ(x), γ(y)) if x, y ∈ ℍ².


Proof We know that if there exists such a function, then for all z₁, z₂ ∈ ℍ² with
ℜ(z₁) = ℜ(z₂) = 0,

\[ d_r(z_1, z_2) = d_r\left(z_1, \frac{z_2}{z_1} z_1\right) = r \left| \ln\left(\frac{z_2}{z_1}\right) \right|. \]

On the other hand, by Theorem 4.3, we know that for any two distinct elements
z₁, z₂ ∈ ℍ², there exists an element γ ∈ Isom(ℍ²) such that γ(z₁) = i, and
γ(z₂) = it for some t > 1. Consequently, it must be that

\[ d_r(z_1, z_2) = d_r(\gamma(z_1), \gamma(z_2)) = d_r(i, it) = r \ln(t). \]

This implies that there is at most one function d_r satisfying the desired properties.
To prove that there is such a function, we define

\[
d_r : \mathbb{H}^2 \times \mathbb{H}^2 \to [0, \infty), \qquad
(z_1, z_2) \mapsto \begin{cases}
0 & \text{if } z_1 = z_2 \\
r \ln(t) & \text{if } \gamma(z_1) = i,\ \gamma(z_2) = it \text{ with } t > 1,\ \gamma \in \mathrm{Isom}(\mathbb{H}^2).
\end{cases}
\]

This function is in fact well-defined: while there are multiple elements γ with the
desired property, we know by Theorem 4.3 that they will all send z₂ to the same point
it. It is easy to see that d_r(i, ei) = r and that d_r(γ(z₁), γ(z₂)) = d_r(z₁, z₂) for any
z₁, z₂ ∈ ℍ² and γ ∈ Isom(ℍ²). It remains to check that d_r(z₁, z₂) = d_r(z₂, z₁). Let
γ be such that γ(z₁) = i and γ(z₂) = it. Then

\[ d_r(z_2, z_1) = d_r(it, i) = d_r(i, it) = d_r(z_1, z_2). \]

Thus, d_r has all of the required properties.


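It would be good to have a more explicit formula for this function d_r. To get this,
we prove a simple classical result.


Theorem 4.4 (SL(2, ℝ) Imaginary Transformation Law)
For any z ∈ ℍ² and

\[ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{R}), \]

ℑ(γ.z) = |cz + d|⁻² ℑ(z).


Proof This is a straightforward calculation.

\[
\begin{aligned}
\Im\left(\frac{az + b}{cz + d}\right) &= |cz + d|^{-2}\, \Im\left((az + b)\overline{(cz + d)}\right) \\
&= |cz + d|^{-2}\, \Im\left((az + b)(c\bar{z} + d)\right) \\
&= |cz + d|^{-2}\, \Im\left(ac|z|^2 + adz + bc\bar{z} + bd\right) \\
&= |cz + d|^{-2} \left(ad\,\Im(z) - bc\,\Im(z)\right) = |cz + d|^{-2}\, \Im(z).
\end{aligned}
\]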
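A quick numerical sanity check of the transformation law is easy to write down. In this sketch a matrix is represented as a pair of rows `((a, b), (c, d))`, which is purely my own convention for the illustration, and the two helper names are hypothetical.

```python
def act(gamma, z):
    """Apply gamma = ((a, b), (c, d)) to z as a linear fractional transformation."""
    (a, b), (c, d) = gamma
    return (a * z + b) / (c * z + d)

def imaginary_part_by_law(gamma, z):
    """Right-hand side of the transformation law: |cz + d|^(-2) * Im(z)."""
    (_, _), (c, d) = gamma
    return z.imag / abs(c * z + d) ** 2

gamma = ((2.0, 1.0), (3.0, 2.0))   # determinant 2*2 - 1*3 = 1
z = 0.5 + 2.0j
print(act(gamma, z).imag, imaginary_part_by_law(gamma, z))
```

The hypothesis ad − bc = 1 really is used: for a matrix of determinant 2, such as ((2, 0), (0, 1)), the two quantities differ by a factor of 2.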


Lemma 4.5 The unique function d_r : ℍ² × ℍ² → [0, ∞) defined in Lemma 4.4 is
given by

\[
d_r : \mathbb{H}^2 \times \mathbb{H}^2 \to [0, \infty), \qquad
(z_1, z_2) \mapsto r \cosh^{-1}\left(1 + \frac{|z_2 - z_1|^2}{2\Im(z_1)\Im(z_2)}\right),
\]

where cosh : ℝ → [1, ∞) is the hyperbolic cosine, defined by

\[ \cosh(x) = \frac{e^{x} + e^{-x}}{2}. \]


Remark 4.3 There is something to prove here: namely, that cosh is actually invertible
in any sense. Indeed, it is not if we consider it over its full domain; however, if we
restrict to [0, ∞), then everything goes through as it should. This is illustrated in
Figure 4.8. For this and more about the other hyperbolic functions, see Exercises
4.2.1 and 4.3.1.

Fig. 4.8 A plot of y = cosh(x) in blue and a plot of y = cosh⁻¹(x) in green, dashed.


Proof By Theorem 4.3, we know that there exists some

\[ \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{R}) \]

such that z₁ = γ.i and z₂ = γ.it for some t > 1. Furthermore, we know from the
preceding two lemmas that r ln(t) = d_r(z₁, z₂), hence t = e^{d_r(z₁, z₂)/r}. This means
that

\[
\begin{aligned}
|z_1 - z_2|^2 = |\gamma.i - \gamma.it|^2 &= \left| \frac{ai + b}{ci + d} - \frac{ait + b}{cit + d} \right|^2 \\
&= |ci + d|^{-2} |cit + d|^{-2} \left| (ai + b)(cit + d) - (ait + b)(ci + d) \right|^2 \\
&= |ci + d|^{-2} |cit + d|^{-2} \left| -act + adi + bcti + bd + act - adti - bci - bd \right|^2 \\
&= |ci + d|^{-2} |cit + d|^{-2} \left| ad(1 - t)i - bc(1 - t)i \right|^2 \\
&= |ci + d|^{-2} |cit + d|^{-2} (1 - t)^2.
\end{aligned}
\]

Here the SL(2, ℝ) imaginary transformation law is very useful, since this means that

\[
\frac{|z_1 - z_2|^2}{\Im(z_1)\Im(z_2)}
= \frac{|ci + d|^{-2} |cit + d|^{-2} (1 - t)^2}{\Im(\gamma.i)\Im(\gamma.it)}
= \frac{|ci + d|^{-2} |cit + d|^{-2} (1 - t)^2}{|ci + d|^{-2} |cit + d|^{-2}\, \Im(i)\Im(it)}
= \frac{(1 - t)^2}{t}.
\]

It is now a matter of solving for t and then for d_r(z₁, z₂). We see that

\[
\begin{aligned}
1 + \frac{|z_1 - z_2|^2}{2\Im(z_1)\Im(z_2)} &= 1 + \frac{(1 - t)^2}{2t} = \frac{1}{2}\left(t + \frac{1}{t}\right) \\
&= \frac{1}{2}\left(e^{d_r(z_1, z_2)/r} + e^{-d_r(z_1, z_2)/r}\right) = \cosh\left(\frac{d_r(z_1, z_2)}{r}\right),
\end{aligned}
\]

and so the stated result follows.
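As a sanity check on this closed form, here is a small Python sketch that evaluates it on the imaginary axis (where it should reduce to r|ln(t₂/t₁)|) and before and after applying an element of SL(2, ℝ). The function names and the pair-of-rows matrix representation are assumptions of mine for the illustration; `math.acosh` is the standard-library inverse hyperbolic cosine.

```python
import math

def d_r(z1, z2, r=1.0):
    """Closed form from Lemma 4.5: r * arccosh(1 + |z2 - z1|^2 / (2 Im z1 Im z2))."""
    return r * math.acosh(1.0 + abs(z2 - z1) ** 2 / (2.0 * z1.imag * z2.imag))

def act(gamma, z):
    """Apply gamma = ((a, b), (c, d)) to z as a linear fractional transformation."""
    (a, b), (c, d) = gamma
    return (a * z + b) / (c * z + d)

# On the imaginary axis the formula should reduce to r * |ln(t2 / t1)|.
print(d_r(1j, math.e * 1j, r=3.0))
print(d_r(2j, 5j), math.log(5 / 2))

# Invariance under an element of SL(2, R).
gamma = ((2.0, 1.0), (3.0, 2.0))
z1, z2 = -0.7 + 0.4j, 1.2 + 2.5j
print(d_r(z1, z2), d_r(act(gamma, z1), act(gamma, z2)))
```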


In some sense, the choice of r > 0 is arbitrary, but it is customary to set r = 1.
There are many ways to motivate this; the most obvious at present is that with the
explicit description of d_r that we have given, it is the most convenient choice. One
can also look at the behavior of d_r close to i or, if one is familiar with Riemannian
geometry, at the curvature. In any case, we are finally ready to define hyperbolic
distance properly.


Theorem 4.5 Define

\[
d_{\mathrm{hyper}} : \mathbb{H}^2 \times \mathbb{H}^2 \to [0, \infty), \qquad
(z_1, z_2) \mapsto \cosh^{-1}\left(1 + \frac{|z_1 - z_2|^2}{2\Im(z_1)\Im(z_2)}\right).
\]

Then (ℍ², d_hyper) is a metric space.


Remark 4.4 This is commonly referred to as the Poincaré upper half-plane model 
of hyperbolic space. 


Proof To prove that (ℍ², d_hyper) is a metric space, the only thing that is
unclear is whether it satisfies the triangle inequality. The intuitive idea behind the
proof is to find a way to “flatten” an arbitrary triangle in ℍ² onto the line x = 0, as in
Figure 4.9. To prove this formally, start with the following observation: if z₁, z₂ ∈ ℍ²,
define w₁ = ℑ(z₁)i and w₂ = ℑ(z₂)i; then d_hyper(z₁, z₂) ≥ d_hyper(w₁, w₂). Why is
this? Well,

\[
\frac{|z_1 - z_2|^2}{2\Im(z_1)\Im(z_2)} \geq \frac{|w_1 - w_2|^2}{2\Im(z_1)\Im(z_2)}
= \frac{|w_1 - w_2|^2}{2\Im(w_1)\Im(w_2)}
\]

and cosh⁻¹ is order-preserving in the sense that if x ≤ y then cosh⁻¹(x) ≤
cosh⁻¹(y). (See Exercise 4.3.1.) So, choose any z₁, z₂, z₃ ∈ ℍ². By applying Theorem
4.3, we know that there exists some γ ∈ SL(2, ℝ) such that γ.z₁ = i, γ.z₃ = it
for some t > 1; we also know that d_hyper(γ.z, γ.w) = d_hyper(z, w) for any z, w ∈ ℍ².
Thus, to prove that

\[ d_{\mathrm{hyper}}(z_1, z_3) \leq d_{\mathrm{hyper}}(z_1, z_2) + d_{\mathrm{hyper}}(z_2, z_3), \]

it actually suffices to prove that

\[ d_{\mathrm{hyper}}(i, it) \leq d_{\mathrm{hyper}}(i, \gamma.z_2) + d_{\mathrm{hyper}}(\gamma.z_2, it). \]

Here we use the trick that we can “flatten” γ.z₂ onto the line x = 0. We know that
d_hyper(i, γ.z₂) ≥ d_hyper(i, ℑ(γ.z₂)i) and d_hyper(γ.z₂, it) ≥ d_hyper(ℑ(γ.z₂)i, it), and
so actually it suffices to prove that for any t > 1 and any s > 0,

\[ d_{\mathrm{hyper}}(i, it) \leq d_{\mathrm{hyper}}(i, is) + d_{\mathrm{hyper}}(is, it). \]

But this is immediate from the additive property of d_hyper along the line x = 0. Thus,
(ℍ², d_hyper) is a metric space.

Fig. 4.9 A visualization of the proof of Theorem 4.5. We start with three points z₁, z₂, z₃ in general
position. We then apply an element φ ∈ Isom(ℍ²) to move z₁ to i and z₂ to it. Finally, we “flatten”
φ(z₃) onto the y-axis in a way that can only decrease hyperbolic distance.

By construction, Isom(ℍ²) consists of isometries of this metric space. We will
show later, after we have a better feel for the geometry of this space, that it consists
of all of the isometries. As a step in this direction, let’s try to understand this weird
metric that we have constructed by thinking about what happens if we choose two
points that are very close together. One can show that

\[ \frac{d}{dx}\left(\cosh^{-1}(x)\right) = \frac{1}{\sqrt{x^2 - 1}} \]

(see Exercise 4.3.1), so

\[
\frac{d}{dx}\left(\cosh^{-1}\left(1 + \frac{x^2}{2}\right)\right)
= \frac{x}{\sqrt{\left(1 + \frac{x^2}{2}\right)^2 - 1}}
= \frac{2x}{\sqrt{x^2(x^2 + 4)}}
= \frac{2\,\mathrm{sgn}(x)}{\sqrt{x^2 + 4}},
\]

where sgn(x) = 1 if x > 0 and −1 if x < 0. This means that we can do a linear
approximation for f(x) = cosh⁻¹(1 + x²/2) in the region x > 0 by

\[ f(0) + x \lim_{t \to 0^+} f'(t) = x, \]

which will be roughly accurate as long as x is very small. What does this have to do
with our metric? Suppose that z₁, z₂ ∈ ℍ² are close together, so |z₁ − z₂|/y ≈ 0,
where y = ℑ(z₁). Then

\[
\begin{aligned}
d_{\mathrm{hyper}}(z_1, z_2) &= \cosh^{-1}\left(1 + \frac{|z_1 - z_2|^2}{2\Im(z_1)\Im(z_2)}\right) \\
&\approx \cosh^{-1}\left(1 + \frac{|z_1 - z_2|^2}{2y^2}\right) \\
&\approx \frac{|z_1 - z_2|}{y} = \frac{1}{y}\, d_{\mathrm{Euclid}}(z_1, z_2).
\end{aligned}
\]

Ah-ha! What we have determined is that if we just look at points that are quite
close together, the hyperbolic metric is essentially just the Euclidean one, but scaled
according to the distance above the x-axis. In particular, as we get closer and closer
to the line y = 0, space becomes more and more compacted:
shorter and shorter Euclidean distances correspond to larger and larger hyperbolic
distances. For a visualization of this, see Figure 4.10.


> Example Let z₁ ≠ z₂ ∈ ℍ² be two points such that d_hyper(z₁, z₂) = λ. Suppose
Ψ is an isometry of ℍ² such that Ψ(z₁) = i and Ψ(z₂) = it for some t > 1. Find
t as a function of λ.

Since Ψ is an isometry, d_hyper(Ψ(z₁), Ψ(z₂)) = λ by definition. Therefore,

\[ \lambda = d_{\mathrm{hyper}}(i, it) = \left| \ln\left(\frac{it}{i}\right) \right| = \ln(t), \]

so t = e^λ.

Fig. 4.10 All of the dashed curves are equidistant from one to the next in the hyperbolic metric.
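Both the local approximation d_hyper(z₁, z₂) ≈ |z₁ − z₂|/y derived earlier in this section and the worked example can be checked numerically. The Python sketch below compares the two quantities for the same small Euclidean step taken at different heights; the step size, the heights, and the helper names are arbitrary choices of mine.

```python
import math

def d_hyper(z1, z2):
    """Hyperbolic distance on the upper half-plane (Theorem 4.5)."""
    return math.acosh(1.0 + abs(z1 - z2) ** 2 / (2.0 * z1.imag * z2.imag))

def scaled_euclidean(z1, z2):
    """The approximation |z1 - z2| / y with y = Im(z1)."""
    return abs(z1 - z2) / z1.imag

step = 1e-3 + 1e-3j
for height in (10.0, 1.0, 0.1):
    z = 2.0 + height * 1j
    print(height, d_hyper(z, z + step), scaled_euclidean(z, z + step))

# The worked example: if d_hyper(i, it) = lam with t > 1, then t = e^lam.
lam = 0.75
print(d_hyper(1j, math.exp(lam) * 1j))
```

The same Euclidean step is hyperbolically longer the closer it is taken to the real axis, which is the compaction described above.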


4.4 Circles and Lines 


Our goal in this section is to find hyperbolic analogs of Euclidean circles and lines.
The first definition is quite straightforward, as it is exactly the original Euclidean
definition, but with d_Euclid replaced by d_hyper.


Definition 4.6 For any z ∈ ℍ² and r > 0, the hyperbolic circle with center z and
radius r is the locus of points w ∈ ℍ² such that d_hyper(w, z) = r.


What does a hyperbolic circle look like? Let’s first work this out in the special
case where the center is i.


Lemma 4.6 The hyperbolic circle with center i and radius r is the Euclidean circle
with center cosh(r)i and radius sinh(r), where sinh : ℝ → ℝ is the hyperbolic sine,
defined by

\[ \sinh(x) = \frac{1}{2}\left(e^{x} - e^{-x}\right). \]


Proof If w is a point on this hyperbolic circle, then

\[ d_{\mathrm{hyper}}(w, i) = \cosh^{-1}\left(1 + \frac{|w - i|^2}{2\Im(w)}\right) = r, \]

or equivalently,

\[ |w - i|^2 + 2(1 - \cosh(r))\Im(w) = 0. \]

Let’s write w = x + iy. Then the above can be rewritten as

\[ x^2 - 2y + y^2 + 1 + 2(1 - \cosh(r))y = 0, \]

which we recognize as the equation of a Euclidean circle. Determining the center
and radius of this circle is just a matter of completing the square, which I leave as
an exercise. (See Exercise 4.2.4.)

Fig. 4.11 On the left, a family of hyperbolic circles with center i. On the right, a collection of
hyperbolic circles all with radius 1/5. The one furthest from the x-axis has hyperbolic center i/4.
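Without spoiling the exercise, one can at least test the statement of Lemma 4.6 numerically: sample points on the Euclidean circle with center cosh(r)i and radius sinh(r), and check that each is at hyperbolic distance r from i. The following Python sketch does this for one value of r; the choice r = 1.2, the eight sample angles, and the helper names are mine.

```python
import cmath
import math

def d_hyper(z1, z2):
    """Hyperbolic distance on the upper half-plane (Theorem 4.5)."""
    return math.acosh(1.0 + abs(z1 - z2) ** 2 / (2.0 * z1.imag * z2.imag))

def euclidean_data(r):
    """Euclidean center and radius claimed by Lemma 4.6 for hyperbolic radius r."""
    return complex(0.0, math.cosh(r)), math.sinh(r)

r = 1.2
center, radius = euclidean_data(r)
for k in range(8):
    w = center + radius * cmath.exp(2j * math.pi * k / 8)
    print(round(d_hyper(w, 1j), 9))
```

Note that the Euclidean center cosh(r)i sits above the hyperbolic center i, in line with the observation that space is more compacted near the real axis.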


This is enough to determine what hyperbolic circles are like in general. 


Theorem 4.6 Hyperbolic circles are Euclidean circles (but with different centers 
and radii). 


Remark 4.5 To better understand the differences, some families of hyperbolic circles 
are shown in Figure 4.11. 


Proof If C is the hyperbolic circle with center z and radius r, then we can choose an
element φ ∈ PSL(2, ℝ) such that φ(z) = i. Then φ(C) is the hyperbolic circle with
center i and radius r, which we know from the previous lemma is just a Euclidean
circle. We know that linear fractional transformations preserve generalized circles, so
that means that C = φ⁻¹(φ(C)) is a generalized circle contained inside ℍ², which
must simply be a Euclidean circle.


Marvelous! Next, we give the same treatment to hyperbolic lines. It is less obvious
how to define these. One potential suggestion would be to say that a line should be
the shortest path between two points z₁, z₂ ∈ ℍ², as measured with respect to the
hyperbolic distance. This is possible to make sense of but complicated. We will
instead proceed as follows: recall that a line in the Euclidean plane can be thought
of as the locus of points equidistant from two fixed points z₁, z₂; that is, all w ∈ ℂ
such that d_Euclid(w, z₁) = d_Euclid(w, z₂). This fact was used in the proof of Theorem
2.10, the Algebraic Description of Generalized Circles.

Fig. 4.12 Pairs of points in the hyperbolic plane and the hyperbolic lines that correspond to them.


Definition 4.7 A hyperbolic line in ℍ² is a locus of points in ℍ² of the form

\[ \{ w \in \mathbb{H}^2 \mid d_{\mathrm{hyper}}(w, z_1) = d_{\mathrm{hyper}}(w, z_2) \}, \]

for some fixed points z₁ ≠ z₂ ∈ ℍ².


Some examples of such loci are depicted in Figure 4.12, suggesting that they are 
really just Euclidean circles of some kind. As before, our method to actually prove 
this is to first work out a simple case. 


Lemma 4.7 Let z₁ ≠ z₂ ∈ ℍ² such that z₁ = −z̄₂. The corresponding hyperbolic
line is x = 0 (restricted to ℍ²).


Proof By definition, w ∈ ℍ² is on this line if and only if

\[
d_{\mathrm{hyper}}(w, z_1) = \cosh^{-1}\left(1 + \frac{|w - z_1|^2}{2\Im(w)\Im(z_1)}\right)
= \cosh^{-1}\left(1 + \frac{|w - z_2|^2}{2\Im(w)\Im(z_2)}\right) = d_{\mathrm{hyper}}(w, z_2),
\]

or equivalently if

\[ \frac{|w - z_1|^2}{\Im(z_1)} = \frac{|w - z_2|^2}{\Im(z_2)}. \]

If z₁ = −z̄₂, then ℑ(z₁) = ℑ(z₂), so this further simplifies to just |w − z₁| = |w − z₂|.
This is just the equation defining the Euclidean line of points equidistant from z₁ and
z₂; that line is just x = 0.


It is immediate from this lemma that any generalized circle orthogonal to the real
line must be a hyperbolic line: x = 0 is simply the special case where it is orthogonal
at 0 and ∞, but since we know that Isom(ℍ²) allows us to freely move points on
the boundary but preserves circles and angles, we get the desired conclusion. What
is less obvious is that any hyperbolic line has to be of such a form, and we couldn’t
possibly get any pathological counterexamples. To prove that, we need to show that
we can move arbitrary points in ℍ² into the special position of Lemma 4.7. We will
do it using a sequence of lemmas.


Lemma 4.8 Let z₁ ≠ z₂ ∈ ℍ². There exists a unique generalized circle passing
through both z₁ and z₂ which is orthogonal to the real line.


Proof Choose φ ∈ PSL(2, ℝ) such that φ(z₁) = i and φ(z₂) = it for some t > 1;
since φ preserves generalized circles, ℝ, and angles, there exists a unique generalized
circle with the desired properties if and only if there exists a unique generalized circle
through i and it which is orthogonal to ℝ. If C is a generalized circle with inversive
coordinates (κ, κ′, ξ), then it is orthogonal to ℝ if and only if ℑ(ξ) = 0,
which is to say that its bend-center is real. On the other hand, if C is a circle passing
through i and it, then its center is x + i(1 + t)/2 for some x ∈ ℝ, so its bend-center is
non-real. Therefore, if C is a generalized circle that passes through i and it and is
orthogonal to ℝ, then it is a line; specifically, it has to be the line x = 0.


Lemma 4.9 Let C be a generalized circle orthogonal to the real line and z ∈ ℍ²
be a point on C. There exists a unique generalized circle passing through z which is
orthogonal to both the real line and C.


Proof Choose any other point z′ ∈ ℍ² on C and an element φ ∈ PSL(2, ℝ) such
that φ(z) = i and φ(z′) = it. It is clear that φ(C) is the line x = 0. If we can
show that there exists a unique generalized circle through i which is orthogonal to
both x = 0 and y = 0, we will be done. Under what circumstances is a generalized
circle orthogonal to both x = 0 and y = 0? This happens exactly when it is a circle
centered at the origin. (See Exercise 4.2.5.) Such a circle passes through i if and only
if it is the unit circle.


Lemma 4.10 For any z; 4 z2 € Hl’, there exists 9 € Isom(H) such that o(z1) = 
—9 (22). 


Proof We proceed geometrically; the essential steps are depicted in Figure 4.13. 
First, draw a hyperbolic line $C$ through $z_1$ and $z_2$, that is, a generalized circle 
orthogonal to the boundary of $\mathbb{H}^2$. We know there is a unique such circle thanks to Lemma 
4.8. Next, choose a point $w$ on this line such that $d_{\mathrm{hyper}}(w, z_1) = d_{\mathrm{hyper}}(w, z_2)$. Why 
must there be such a point? In principle, we could try to directly solve for it, but 
it is easier to consider a path $p(t)$ from $z_1$ to $z_2$ along $C$, such that $p(0) = z_1$ and 
$p(1) = z_2$. Consider the function 


Fig. 4.13 A visualization of the proof of Lemma 4.10. We start with a pair of points $z_1, z_2 \in \mathbb{H}^2$ 
in (a). In (b), we draw a hyperbolic line through this pair and find the midpoint $w$ between them. In 
(c), we draw another hyperbolic line $C'$ perpendicular to the first through the midpoint and choose 
another point $w'$ on this perpendicular line. In (d), we choose an element $\gamma \in \mathrm{PSL}(2, \mathbb{R})$ which 
sends the midpoint to $i$ and the other point on our perpendicular bisector to $it$. This forces us into 
the desired configuration. 


$$f(t) = d_{\mathrm{hyper}}(p(t), z_1) - d_{\mathrm{hyper}}(p(t), z_2).$$


Clearly, $f(0) = -d_{\mathrm{hyper}}(z_1, z_2) < 0$ and $f(1) = d_{\mathrm{hyper}}(z_1, z_2) > 0$. Since $f$ is continuous, 
there has to exist some $0 < t < 1$ where $f(t) = 0$, but that is exactly a point 
such that $w = p(t)$ is equidistant from $z_1$ and $z_2$. Using Lemma 4.9, construct a 
generalized circle $C'$ through this point $w$ orthogonal to both $\mathbb{R}$ and $C$. Now, choose 
any other point $w' \in \mathbb{H}^2$ on $C'$ and an element $\gamma \in \mathrm{PSL}(2, \mathbb{R})$ such that $\gamma(w) = i$ 
and $\gamma(w') = it$ for some $t > 1$. This forces $\gamma(C')$ to be the line $x = 0$ and $\gamma(C)$ to 
be the unit circle. Therefore, $\gamma(z_1) = e^{i\theta_1}$ and $\gamma(z_2) = e^{i\theta_2}$ for some $0 < \theta_1, \theta_2 < \pi$. 
Since $w$ was equidistant from $z_1$ and $z_2$, $i$ is equidistant from $\gamma(z_1)$ and $\gamma(z_2)$, which 
is to say that 


$$\cosh^{-1}\left(1 + \frac{|e^{i\theta_1} - i|^2}{2\sin(\theta_1)}\right) = \cosh^{-1}\left(1 + \frac{|e^{i\theta_2} - i|^2}{2\sin(\theta_2)}\right),$$


or equivalently, 

$$\frac{|e^{i\theta_1} - i|^2}{2\sin(\theta_1)} = \frac{|e^{i\theta_2} - i|^2}{2\sin(\theta_2)}.$$

This, too, we can simplify, since $|e^{i\theta} - i|^2 = 2(1 - \sin(\theta))$, and so we see that in 
fact when all is said and done we must have $\sin(\theta_1) = \sin(\theta_2)$. Since $\theta_1 \neq \theta_2$, the only way for this 
to happen is if $\theta_2 = \pi - \theta_1$, hence $\gamma(z_1) = e^{i\theta_1} = -\overline{(-e^{-i\theta_1})} = -\overline{\gamma(z_2)}$. 
We see that we have proved the following. 


Theorem 4.7 Hyperbolic lines are generalized circles orthogonal to the real line 
(restricted to $\mathbb{H}^2$). Furthermore, they have the following properties. 


1. For any $z_1 \neq z_2 \in \mathbb{H}^2$, there exists a unique hyperbolic line passing through 
them both. 

2. For any hyperbolic line $l$ and a point $z \in l$, there exists a unique hyperbolic line 
$l'$ that is orthogonal to $l$ at $z$. 


Proof If $l$ is a hyperbolic line defined by a pair of points $z_1, z_2$, we know by Lemma 
4.10 that we can choose $\gamma \in \mathrm{Isom}(\mathbb{H}^2)$ such that $\gamma(z_1) = -\overline{\gamma(z_2)}$, hence $\gamma(l)$ is 
the line $x = 0$ by Lemma 4.7. As we discussed before, this is sufficient to prove 
that hyperbolic lines are exactly the generalized circles orthogonal to $\mathbb{R}$. The two 
additional properties are simply the statements of Lemmas 4.8 and 4.9 in this new 
language. 


Before we conclude this section, I want to point out something about our definitions: 
specifically, it is quite obvious that you could extend both of them to any metric 
space whatsoever, given that they are entirely phrased in terms of the metric $d_{\mathrm{hyper}}$. 
Indeed, both of these definitions do show up in the metric geometry literature, but 
under different names. What we have called a hyperbolic circle is in general known 
as a sphere; similarly, what we know as a hyperbolic line is in general known as a 
hyper-plane. The difference in terminology is simply because the hyperbolic plane 
is “low-dimensional” in a sense; indeed, one sees that in $(\mathbb{H}^3, d_{\mathrm{hyper}})$, what we have 
termed a circle will be a sphere and what we have termed a line will be a plane. (See 
Exercise 4.2.6.) 


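> Example Let $C_{\lambda,r}$ be the hyperbolic circle with center $\lambda i$ and radius $r$, where 
$\lambda, r > 0$. Find the Euclidean center and radius of $C_{\lambda,r}$ as functions of $\lambda$ and $r$. 

We know that the hyperbolic circle $C_{1,r}$ with center $i$ and radius $r$ has Euclidean center 
$\cosh(r)i$ and radius $\sinh(r)$. If 

$$\gamma = \begin{pmatrix} \sqrt{\lambda} & 0 \\ 0 & 1/\sqrt{\lambda} \end{pmatrix},$$

then $\gamma.i = \lambda i$, and therefore $\gamma.C_{1,r}$ is the hyperbolic circle with center $\lambda i$ and radius 
$r$, that is, $C_{\lambda,r}$. But, of course, $\gamma$ acts as the scaling $z \mapsto \lambda z$, so that just means that $C_{\lambda,r}$ is the Euclidean circle with 
center $\lambda\cosh(r)i$ and radius $\lambda\sinh(r)$. 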
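
The answer to this example can be sanity-checked numerically. The sketch below is only an illustration: the helper names `d_hyper` and `euclidean_circle` are hypothetical, and it assumes the half-plane distance formula $d_{\mathrm{hyper}}(z_1, z_2) = \cosh^{-1}\left(1 + |z_1 - z_2|^2 / (2\Im(z_1)\Im(z_2))\right)$ used elsewhere in this chapter. It samples points on the claimed Euclidean circle and measures their hyperbolic distance to $\lambda i$.

```python
import cmath
import math


def d_hyper(z1, z2):
    """Hyperbolic distance in the upper half-plane model."""
    return math.acosh(1 + abs(z1 - z2) ** 2 / (2 * z1.imag * z2.imag))


def euclidean_circle(lam, r):
    """Claimed Euclidean (center, radius) of the hyperbolic circle C_{lam,r}."""
    return complex(0, lam * math.cosh(r)), lam * math.sinh(r)


lam, r = 2.5, 0.8
center, radius = euclidean_circle(lam, r)

# Twelve evenly spaced points on the Euclidean circle.
samples = [center + radius * cmath.exp(2j * math.pi * k / 12) for k in range(12)]

# Each of them should sit at hyperbolic distance r from lam * i.
distances = [d_hyper(z, complex(0, lam)) for z in samples]
```

Every entry of `distances` should agree with `r` up to rounding error, whatever positive values of `lam` and `r` are chosen.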
Fig. 4.14 The geometric intuition behind the proof of Theorem 4.8. Given a point $z \in \mathbb{H}^2$ (drawn 
in purple in both examples), there are circles passing through it with centers at $i$, $ei$, and $z_0$; these 
circles intersect in a unique point. Thus, if the radii of the circles are known, the point $z$ is specified 
uniquely. Two examples of this process of triangulation are shown. 


4.5 Isometries and Geometric Notions 


It is time to make good on our promise of showing that all isometries of $\mathbb{H}^2$ are 
contained in the group $\mathrm{Isom}(\mathbb{H}^2)$. 


Theorem 4.8 $\mathrm{Isom}(\mathbb{H}^2, d_{\mathrm{hyper}}) = \mathrm{Isom}(\mathbb{H}^2)$. 


Proof We have already shown that $\mathrm{Isom}(\mathbb{H}^2) \subset \mathrm{Isom}(\mathbb{H}^2, d_{\mathrm{hyper}})$. It remains to show 
that if $\Psi \in \mathrm{Isom}(\mathbb{H}^2, d_{\mathrm{hyper}})$, then $\Psi \in \mathrm{Isom}(\mathbb{H}^2)$. Let $z_1 = \Psi(i)$ and $z_2 = \Psi(ei)$. 
Then there exists $\gamma \in \mathrm{PSL}(2, \mathbb{R})$ such that $\gamma(z_1) = i$ and $\gamma(z_2) = it$ for some 
$t > 1$. In fact, since 


$$d_{\mathrm{hyper}}(i, it) = d_{\mathrm{hyper}}(z_1, z_2) = d_{\mathrm{hyper}}(i, ei),$$


we see that 


$$|\ln(t)| = 1,$$


and since $t > 1$, it must be that $t = e$. Therefore, $\Psi' = \gamma \circ \Psi$ has the property that 
$\Psi'(i) = i$ and $\Psi'(ei) = ei$. Note that $\Psi' \in \mathrm{Isom}(\mathbb{H}^2)$ if and only if $\Psi$ is. Therefore, 
without loss of generality, we may assume that $\Psi$ fixes both $i$ and $ei$. Next, 
choose any point $z_0 \in \mathbb{H}^2$ to the right of the line $x = 0$. Define $r_1 = d_{\mathrm{hyper}}(z_0, i)$ 
and $r_2 = d_{\mathrm{hyper}}(z_0, ei)$. Since $\Psi$ is an isometry, it must be that 


$$r_1 = d_{\mathrm{hyper}}(z_0, i) = d_{\mathrm{hyper}}(\Psi(z_0), \Psi(i)) = d_{\mathrm{hyper}}(\Psi(z_0), i);$$

similarly, $d_{\mathrm{hyper}}(\Psi(z_0), ei) = r_2$. Now, draw the hyperbolic circles centered at $i$ and 
$ei$ with radii $r_1$ and $r_2$, respectively. Both $z_0$ and $\Psi(z_0)$ must lie on both of these 
circles; however, as hyperbolic circles are Euclidean circles, we see that there are 
exactly two intersection points. Moreover, by symmetry, these two intersection points 
are swapped by the map $\phi(z) = -\overline{z}$. Therefore, either $\Psi(z_0) = z_0$ or $(\phi \circ \Psi)(z_0) = z_0$. 
Since $(\phi \circ \Psi)(i) = i$ and $(\phi \circ \Psi)(ei) = ei$, we see that we may assume without 
loss of generality that $\Psi(i) = i$, $\Psi(ei) = ei$, and $\Psi(z_0) = z_0$. I claim that actually 
this forces $\Psi(z) = z$ identically, which of course means that $\Psi \in \mathrm{Isom}(\mathbb{H}^2)$ as 
claimed. 

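Why is this? We can just do triangulation, as in Figure 4.14. For any point $z \in \mathbb{H}^2$, 
we may use the same argument as for $z_0$ to conclude that either $\Psi(z) = z$ or 
$\Psi(z) = -\overline{z}$. If $z$ lies on the line $x = 0$, then $z = -\overline{z}$ and there is nothing to prove. 
Otherwise, since $z_0$ is to the right of the $x = 0$ line, either $d_{\mathrm{hyper}}(z, z_0) < 
d_{\mathrm{hyper}}(-\overline{z}, z_0)$ or $d_{\mathrm{hyper}}(z, z_0) > d_{\mathrm{hyper}}(-\overline{z}, z_0)$; they cannot be equal. Since $\Psi$ 
is an isometry, it must preserve this inequality. So, for instance, if $d_{\mathrm{hyper}}(z, z_0) < 
d_{\mathrm{hyper}}(-\overline{z}, z_0)$, then $d_{\mathrm{hyper}}(\Psi(z), z_0) < d_{\mathrm{hyper}}(\Psi(-\overline{z}), z_0)$. But this can only occur 
if $\Psi(z) = z$. 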


We will do a more careful classification of isometries of $\mathbb{H}^2$ in the next chapter; 
for now, we will gainfully employ the fact that we know what the isometry group is 
to define various geometrical terms. 


Definition 4.8 The orientation-preserving isometry group of $\mathbb{H}^2$ is 
$\mathrm{Isom}^\circ(\mathbb{H}^2) = \mathrm{PSL}(2, \mathbb{R})$. Any isometry of $\mathbb{H}^2$ is either orientation-preserving 
(if it is in $\mathrm{Isom}^\circ(\mathbb{H}^2)$) or otherwise orientation-reversing. 


There is no reasonable way to define the orientation for arbitrary metric spaces, 
but we already know that $\mathrm{Isom}(\mathbb{H}^2)$ splits neatly into these two pieces, so this is 
entirely sensible. 


Definition 4.9 Let $p_1, p_2 : [0,1] \to \mathbb{H}^2$ be two differentiable paths such that 
$p_1(1) = p_2(1) = z_0$. The (hyperbolic) angle of intersection at $z_0$ between these 
two paths is the Euclidean angle of intersection. 


Remark 4.6 We know that all of the isometries of $\mathbb{H}^2$ preserve Euclidean angles of 
intersection, so the definition we have chosen works perfectly well: in particular, we 
know that hyperbolic isometries preserve hyperbolic angles, exactly as one expects. 


Definition 4.10 A (hyperbolic) polygon is a connected subset of $\mathbb{H}^2$ bounded by a 
finite number of (hyperbolic) line segments, referred to as its sides. The vertices of 
the polygon are the intersection points of the sides. The angles of the polygon are 
the angles of the intersection at each of the vertices. 


While all of these definitions are just like those for the Euclidean plane, their behavior in the 
hyperbolic plane is not always analogous. The reader might recall, for example, 
that the sum of the angles of a Euclidean polygon with $n$ sides is $(n - 2)\pi$. Not so 
for hyperbolic polygons, as one can see from Figure 4.15! 


4.6 The Poincaré Disk Model 


One slightly frustrating characteristic of the Poincaré half-plane model is that most 
of the geometry is “off at infinity”: we can only see a tiny fraction of the whole 

Fig. 4.15 A tiling of the hyperbolic plane by pentagons. Note that the angles of each pentagon 
have measure $\pi/2$, so their total sum is $2.5\pi$. 

hyperbolic plane. What if we could represent it in a different way that didn’t have 
this problem? We know that there exists a linear fractional transformation $\gamma$ such 
that $\gamma(\mathbb{H}^2)$ is, say, the unit disk, so there is nothing preventing us from changing the 
underlying metric space. This would still have the highly desirable property that the 
isometries of this new space would still be Möbius transformations, with everything 
that implies. Before we work out the details, there is an alternate characterization of 
the hyperbolic metric that will be very useful. 


Theorem 4.9 Let $z_1, z_2 \in \mathbb{H}^2$ be distinct points. Let $l$ be the hyperbolic line between 
them. Let $a, b \in \partial\mathbb{H}^2$ be the intersection points of $l$ with the boundary. Then 


$$d_{\mathrm{hyper}}(z_1, z_2) = \left|\ln\left([z_1, z_2; a, b]\right)\right|.$$


Proof We know that elements of $SL(2, \mathbb{R})$ acting as linear fractional transformations 
preserve both the cross-ratio and the hyperbolic distance; moreover, they can move 
any two points $z_1, z_2 \in \mathbb{H}^2$ to $i, e^t i$ where $t = d_{\mathrm{hyper}}(z_1, z_2)$. Thus, it shall suffice 
to prove the theorem for this case, where it is easy to check that $a$ and $b$ are $0$ and 
$\infty$, although not necessarily in that order; however, it doesn’t matter whether we 
choose $a = 0$ and $b = \infty$ or vice versa. Indeed, 


$$\left|\ln\left([i, e^t i; 0, \infty]\right)\right| = \left|\ln\left(e^{t}\right)\right| = t,$$
$$\left|\ln\left([i, e^t i; \infty, 0]\right)\right| = \left|\ln\left(e^{-t}\right)\right| = t,$$


proving the theorem. 
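
For readers who like to experiment, here is a small numerical sketch comparing the cross-ratio expression with the inverse hyperbolic cosine formula for the distance. It rests on two assumptions that are not fixed by the text above: the cross-ratio is taken with the convention $[z_1, z_2; a, b] = \frac{(z_1 - a)(z_2 - b)}{(z_1 - b)(z_2 - a)}$ (another common convention gives the reciprocal, which does not change the absolute value of the logarithm), and the two points have different real parts, so that the hyperbolic line through them is a half-circle. All function names are hypothetical.

```python
import math


def d_hyper(z1, z2):
    """Half-plane distance via the inverse hyperbolic cosine formula."""
    return math.acosh(1 + abs(z1 - z2) ** 2 / (2 * z1.imag * z2.imag))


def cross_ratio(z1, z2, a, b):
    """Assumed convention; swapping a and b inverts the value."""
    return ((z1 - a) * (z2 - b)) / ((z1 - b) * (z2 - a))


def boundary_points(z1, z2):
    """Real endpoints of the hyperbolic line through z1 and z2.

    Assumes Re(z1) != Re(z2), so the line is a half-circle whose
    Euclidean center c lies on the real axis.
    """
    c = (abs(z2) ** 2 - abs(z1) ** 2) / (2 * (z2.real - z1.real))
    radius = abs(z1 - c)
    return c - radius, c + radius


z1, z2 = complex(1, 2), complex(-3, 0.5)
a, b = boundary_points(z1, z2)

# The cross-ratio of four points on one circle is real up to rounding,
# so we take its modulus before applying the logarithm.
via_cross_ratio = abs(math.log(abs(cross_ratio(z1, z2, a, b))))
via_formula = d_hyper(z1, z2)
```

The two values `via_cross_ratio` and `via_formula` should agree up to rounding error.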


This is very interesting, because the cross-ratio is preserved by all elements in 
PSL(2, C), and not just PSL(2, R). This motivates the following definition. 


Definition 4.11 The Poincaré disk model of hyperbolic space consists of the unit 
disk 


$$\mathbb{D}^2 = \{z \in \mathbb{C} \mid |z| < 1\}$$

Fig. 4.16 An illustration of the transformation from the Poincaré half-plane to the Poincaré disk. 


equipped with a distance function which we shall call, by abuse of notation, 
$d_{\mathrm{hyper}} : \mathbb{D}^2 \times \mathbb{D}^2 \to [0, \infty)$ and defined as follows. If $z_1 = z_2$, then $d_{\mathrm{hyper}}(z_1, z_2) = 0$. 
Otherwise, choose a generalized circle $C$ passing through $z_1$ and $z_2$ and orthogonal 
to the boundary, and let its intersection points with the boundary be $a$ and $b$. Then 


$$d_{\mathrm{hyper}}(z_1, z_2) = \left|\ln\left([z_1, z_2; a, b]\right)\right|.$$


The boundary $\partial\mathbb{D}^2$ is the unit circle. 


Remark 4.7 The existence of a unique generalized circle through $z_1$ and $z_2$ orthogonal 
to the boundary is proved just as it was for $\mathbb{H}^2$. 


Theorem 4.10 (Isometric Isomorphism Between the Poincaré Half-Plane and 
the Poincaré Disk) The Poincaré disk $(\mathbb{D}^2, d_{\mathrm{hyper}})$ is a metric space. Moreover, 


$$\Psi : \mathbb{H}^2 \to \mathbb{D}^2, \qquad z \mapsto \frac{iz + 1}{z + i}$$


is an isometric isomorphism. 

Remark 4.8 See Figure 4.16 for an idea of how to gradually morph one space into 
the other while preserving the hyperbolic metric. 


Proof Consider the linear fractional transformation $\psi(z) = (iz + 1)/(z + i)$. Since 
$\psi(-1) = -1$, $\psi(0) = -i$, and $\psi(1) = 1$, $\psi$ sends $\partial\mathbb{H}^2$, oriented from left to right, 
to the unit circle, oriented counterclockwise. In particular, $\Psi$, the restriction of $\psi$ to 
$\mathbb{H}^2$, is a bijection from $\mathbb{H}^2$ to $\mathbb{D}^2$. For any two points $z_1, z_2 \in \mathbb{H}^2$, since $\Psi$ preserves 
generalized circles and angles, it will send the generalized circle passing through them 
and orthogonal to the real line to the generalized circle passing through $\Psi(z_1), \Psi(z_2)$ 
and orthogonal to the unit circle. It will also send the intersection points $a, b$ with the 
old boundary to the intersection points $\Psi(a), \Psi(b)$ with the new boundary. Finally, 
$\Psi$ preserves the cross-ratio and consequently, 


$$d_{\mathrm{hyper}}(z_1, z_2) = \left|\ln\left([z_1, z_2; a, b]\right)\right| = \left|\ln\left([\Psi(z_1), \Psi(z_2); \Psi(a), \Psi(b)]\right)\right| = d_{\mathrm{hyper}}(\Psi(z_1), \Psi(z_2)).$$

From this, we can conclude that $(\mathbb{D}^2, d_{\mathrm{hyper}})$ is a metric space (after all, if the triangle 
inequality held in $\mathbb{H}^2$, then it must now also hold in $\mathbb{D}^2$) and that $\Psi$ is an isometric 
isomorphism. 


Corollary 4.2 (Explicit Formula for Distance in the Poincaré Disk) 
Let $z_1, z_2 \in \mathbb{D}^2$. Then 


$$d_{\mathrm{hyper}}(z_1, z_2) = \cosh^{-1}\left(1 + \frac{2|z_1 - z_2|^2}{(1 - |z_1|^2)(1 - |z_2|^2)}\right).$$


Proof We know that $\Phi(z) = (z + i)/(iz + 1)$ is an isometric isomorphism between 
$\mathbb{D}^2$ and $\mathbb{H}^2$; this is just the inverse of $\Psi$. Therefore, 


$$d_{\mathrm{hyper}}(z_1, z_2) = d_{\mathrm{hyper}}(\Phi(z_1), \Phi(z_2)) = \cosh^{-1}\left(1 + \frac{|\Phi(z_1) - \Phi(z_2)|^2}{2\Im(\Phi(z_1))\Im(\Phi(z_2))}\right).$$


From there, it is an easy calculation. (See Exercise 4.2.7.) 
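
The corollary can also be tested numerically before doing the calculation by hand. The following sketch (with hypothetical function names) evaluates the half-plane distance and the disk distance on a pair of points related by $z \mapsto (iz + 1)/(z + i)$, and checks that $w \mapsto (w + i)/(iw + 1)$ undoes that map.

```python
import math


def d_half_plane(z1, z2):
    """Distance in the Poincare half-plane model."""
    return math.acosh(1 + abs(z1 - z2) ** 2 / (2 * z1.imag * z2.imag))


def d_disk(w1, w2):
    """Distance in the Poincare disk model, as in the corollary."""
    return math.acosh(
        1 + 2 * abs(w1 - w2) ** 2 / ((1 - abs(w1) ** 2) * (1 - abs(w2) ** 2))
    )


def to_disk(z):
    """The map z -> (iz + 1)/(z + i) from the half-plane to the disk."""
    return (1j * z + 1) / (z + 1j)


def to_half_plane(w):
    """The map w -> (w + i)/(iw + 1) from the disk to the half-plane."""
    return (w + 1j) / (1j * w + 1)


z1, z2 = complex(0.3, 1.7), complex(-2.0, 0.4)
w1, w2 = to_disk(z1), to_disk(z2)

half_plane_distance = d_half_plane(z1, z2)
disk_distance = d_disk(w1, w2)
round_trip = to_half_plane(w1)
```

Here `half_plane_distance` and `disk_distance` should coincide, and `round_trip` should recover `z1`, both up to rounding error.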


Since we can take the isometric isomorphism between $\mathbb{H}^2$ and $\mathbb{D}^2$ to be a linear 
fractional transformation, all of the geometric notions that we had defined for $\mathbb{H}^2$ 
apply equally well in $\mathbb{D}^2$. In particular: 


1. Hyperbolic circles in $\mathbb{D}^2$ are loci of points centered around a fixed point at a fixed 
distance; these are always Euclidean circles, but with shifted centers and radii. 

2. Hyperbolic lines in $\mathbb{D}^2$ are loci of points equidistant from two fixed points; these 
are always generalized circles orthogonal to the boundary $\partial\mathbb{D}^2$. 

3. The isometry group $\mathrm{Isom}(\mathbb{D}^2)$ splits into two pieces: the orientation-preserving 
subgroup $\mathrm{Isom}^\circ(\mathbb{D}^2)$ and the orientation-reversing transformations. 

4. The hyperbolic angle between any two intersecting paths in $\mathbb{D}^2$ is simply the 
Euclidean angle between them; this is always preserved by the isometries. 


Since everything in $\mathbb{H}^2$ corresponds neatly with its counterparts in $\mathbb{D}^2$, we think of 
both of them as describing the same geometric object (concretely, the hyperbolic 
plane) with $\mathbb{H}^2$ and $\mathbb{D}^2$ simply being different models of it. Thus, we may view 
geometric arrangements as in Figure 4.17 as having counterparts in either model. 
There is an overarching paradigm behind this. 

Fig. 4.17 An arrangement of lines, circles, and points in the hyperbolic plane. On the left, this is 
viewed in $\mathbb{H}^2$; on the right, it is viewed in $\mathbb{D}^2$. 


Philosophical Principle 


Consider isomorphic geometries as simply different descriptors of the same under- 
lying mathematical object. Use whichever viewpoint (i.e., isomorphic geometry) 
is most convenient for solving the problem you are currently facing. 


It will be easier to work with the Poincaré disk model if we can better understand 
its isometry group. We know that $\mathrm{Isom}^\circ(\mathbb{D}^2)$ must consist of linear fractional 
transformations. Just as we did for the Poincaré upper half-plane, we would like a 
description of these transformations in terms of some nice matrix group. 


Definition 4.12 For any two natural numbers $m, n$, let $\mathrm{diag}(m, n)$ be the $(m+n) \times 
(m+n)$ diagonal matrix (i.e., all off-diagonal entries are zero) such that the first $m$ 
diagonal entries are $1$, and the other $n$ are $-1$. The indefinite unitary group $U(m, n)$ 
consists of all $(m+n) \times (m+n)$ complex matrices $M$ such that $\overline{M}^T \mathrm{diag}(m, n) M = 
\mathrm{diag}(m, n)$, where $\overline{M}^T$ is the conjugate transpose of $M$. The special indefinite unitary group $SU(m, n)$ consists of all matrices 
in $U(m, n)$ with determinant $1$ or, equivalently, it is the intersection of $U(m, n)$ with 
$SL(m+n, \mathbb{C})$. 


For any $m, n$, $SU(m, n)$ is a group under matrix multiplication (see Exercise 
4.2.8); it is deeply tied to the theory of Hermitian forms, but we won’t pursue this 
point of view. In our case, we are just interested in the group $SU(1, 1)$. 


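Lemma 4.11 


$$SU(1, 1) = \left\{ \begin{pmatrix} \alpha & \beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix} \in SL(2, \mathbb{C}) \right\}.$$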


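Proof We know that by definition, 


$$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL(2, \mathbb{C})$$

is in $SU(1, 1)$ if and only if $\overline{M}^T \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} M = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. After this, it is a 
simple computation that 


$$\begin{pmatrix} \overline{\alpha} & \overline{\gamma} \\ \overline{\beta} & \overline{\delta} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \overline{\alpha} & -\overline{\gamma} \\ \overline{\beta} & -\overline{\delta} \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} |\alpha|^2 - |\gamma|^2 & \overline{\alpha}\beta - \overline{\gamma}\delta \\ \overline{\beta}\alpha - \overline{\delta}\gamma & |\beta|^2 - |\delta|^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

If $\delta = \overline{\alpha}$ and $\gamma = \overline{\beta}$, it is easy to see that this equality holds, so we simply need to 
show the converse. Since $\alpha\delta - \beta\gamma = 1$, $|\alpha|^2\delta - \overline{\alpha}\beta\gamma = \overline{\alpha}$. But $|\alpha|^2 = 1 + |\gamma|^2$ 
and $\overline{\alpha}\beta = \overline{\gamma}\delta$, whence $\overline{\alpha} = (1 + |\gamma|^2)\delta - \overline{\gamma}\delta\gamma = \delta$. We know that $|\delta|^2 \neq 0$ since 
$|\delta|^2 - |\beta|^2 = 1$. Therefore, since $\overline{\alpha}\beta - \overline{\gamma}\delta = (\beta - \overline{\gamma})\delta = 0$, we can safely conclude 
that $\beta = \overline{\gamma}$. 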


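Theorem 4.11 (Accidental Isomorphism Between SL(2, R) and SU(1, 1)) The 
map 


$$\Psi : SL(2, \mathbb{R}) \to SU(1, 1), \qquad M \mapsto \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{i}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} M \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{i}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}^{-1}$$


is a group isomorphism. 


Proof We first need to show that this map really does send elements of $SL(2, \mathbb{R})$ to 
elements of $SU(1, 1)$. For brevity, write $B$ for the matrix $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}$ appearing in the 
definition of $\Psi$, so that $\Psi(M) = BMB^{-1}$; note that $\det(B) = 1$ and $\overline{B}^T = B^{-1}$. Indeed, if $M \in SL(2, \mathbb{R})$, then 
$\Psi(M)$ has determinant $1$ and 


$$\overline{\Psi(M)}^T \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \Psi(M) = B M^T \left( B^{-1} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} B \right) M B^{-1} = B M^T \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} M B^{-1}.$$


If we write 


$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$


then one checks that 

$$B M^T \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} M B^{-1} = B \begin{pmatrix} 0 & -i(ad - bc) \\ i(ad - bc) & 0 \end{pmatrix} B^{-1} = B \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} B^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$


We claim that 


$$SU(1, 1) \to SL(2, \mathbb{R}), \qquad M \mapsto B^{-1} M B$$


is the inverse map. To be precise, this is the inverse map of $\Psi$ if it is well-defined, 
but it is far from obvious that this map really does send elements of $SU(1, 1)$ to 
elements of $SL(2, \mathbb{R})$. However, this really is so, since 


$$B^{-1} \begin{pmatrix} \alpha & \beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix} B = \begin{pmatrix} \Re(\alpha) + \Im(\beta) & \Im(\alpha) + \Re(\beta) \\ -\Im(\alpha) + \Re(\beta) & \Re(\alpha) - \Im(\beta) \end{pmatrix} \in SL(2, \mathbb{R}).$$

This shows that $\Psi$ is a well-defined bijection. That it is a group homomorphism is 
easily confirmed; hence, it is a group isomorphism. 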
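
A numerical sanity check of this isomorphism is sketched below. It is an illustration under a stated assumption, not part of the proof: the conjugating matrix is taken to be $B = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}$, which is the choice consistent with the formula for the inverse map in terms of $\Re(\alpha), \Im(\alpha), \Re(\beta), \Im(\beta)$. The helper names `matmul`, `to_su11`, and `to_sl2r` are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
B = [[S, -1j * S], [-1j * S, S]]    # assumed conjugating matrix
B_INV = [[S, 1j * S], [1j * S, S]]  # its inverse (B has determinant 1)


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def to_su11(m):
    """Conjugate a real matrix M to B M B^{-1}."""
    return matmul(matmul(B, m), B_INV)


def to_sl2r(u):
    """Conjugate back: B^{-1} U B."""
    return matmul(matmul(B_INV, u), B)


M = [[2.0, 3.0], [1.0, 2.0]]  # determinant 2*2 - 3*1 = 1
U = to_su11(M)
alpha, beta = U[0][0], U[0][1]
back = to_sl2r(U)
```

For this particular `M`, the bottom row of `U` should be the conjugate of the top row in reverse order, `abs(alpha)**2 - abs(beta)**2` should equal 1, and `back` should reproduce `M`, all up to rounding error.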


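Theorem 4.12 The map 

$$SU(1, 1) \to \mathrm{Isom}^\circ(\mathbb{D}^2, d_{\mathrm{hyper}}), \qquad \begin{pmatrix} \alpha & \beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix} \mapsto \left( z \mapsto \frac{\alpha z + \beta}{\overline{\beta} z + \overline{\alpha}} \right)$$

is a surjective group homomorphism; two matrices $M, N$ map to the same isometry 
if and only if $N = \pm M$. Moreover, for any $\phi \in \mathrm{Isom}(\mathbb{D}^2)$, either $\phi \in \mathrm{Isom}^\circ(\mathbb{D}^2)$ or 
$\phi \circ \mathrm{conj} \in \mathrm{Isom}^\circ(\mathbb{D}^2)$, where $\mathrm{conj}(z) = \overline{z}$. 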


Proof I leave this one to the reader. (See Exercise 4.2.9.) 


> Example Find the orientation-preserving isometries of $\mathbb{H}^2$ that fix the point $i$. 

We could solve this problem directly by solving $i = (ai + b)/(ci + d)$. There is, 
however, an easier way. The image of $i$ under the standard isometric isomorphism 
$\mathbb{H}^2 \to \mathbb{D}^2$ is $(i^2 + 1)/(i + i) = 0$. So, we can instead look for orientation-preserving 
isometries of $\mathbb{D}^2$ that fix $0$. Consider $y = 0$ and $x = 0$; these are both hyperbolic 
lines passing through $0$. They intersect the boundary at $\pm 1$ and $\pm i$, respectively. 
If $\Psi$ is an isometry of $\mathbb{D}^2$ fixing $0$, then the images of $y = 0$ and $x = 0$ must be 
generalized circles passing through $0$ and orthogonal to the unit circle, which is 
to say that they must be lines through the origin. Furthermore, we know that the 
angle between them has to be preserved, which means that they are just rotated by 
some fixed angle $\theta$; in particular, $\Psi(1) = e^{i\theta}$ and $\Psi(i) = e^{i\theta}i$. There is only one 
linear fractional transformation sending $0 \mapsto 0$, $1 \mapsto e^{i\theta}$, and $i \mapsto e^{i\theta}i$: it must be 
that $\Psi(z) = e^{i\theta}z$. That is, the orientation-preserving isometries of $\mathbb{D}^2$ that fix $0$ are 
precisely the Euclidean rotations around the origin! Now, we have to translate this 
observation back to $\mathbb{H}^2$. We have that $\Psi(z) = U.z$, where 


$$U = \begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix} \in SU(1, 1),$$


so we can use the accidental isomorphism between $SU(1, 1)$ and $SL(2, \mathbb{R})$ to get 
the corresponding transformation 


$$\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{i}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{i}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \frac{1}{2}e^{i\theta/2} + \frac{1}{2}e^{-i\theta/2} & -\frac{i}{2}e^{i\theta/2} + \frac{i}{2}e^{-i\theta/2} \\ \frac{i}{2}e^{i\theta/2} - \frac{i}{2}e^{-i\theta/2} & \frac{1}{2}e^{i\theta/2} + \frac{1}{2}e^{-i\theta/2} \end{pmatrix} = \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}.$$


Therefore, we conclude that the orientation-preserving isometries of $\mathbb{H}^2$ that fix $i$ are 
those of the form 

$$\begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}.z = \frac{\cos(\theta/2)z + \sin(\theta/2)}{-\sin(\theta/2)z + \cos(\theta/2)},$$


for some $\theta \in \mathbb{R}$. 
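
The conclusion of this example is easy to test numerically. The sketch below (with hypothetical helper names) checks that the transformation with matrix entries $\cos(\theta/2)$ and $\pm\sin(\theta/2)$ fixes $i$, and that after passing to the disk via $z \mapsto (iz + 1)/(z + i)$ it becomes the Euclidean rotation $w \mapsto e^{i\theta}w$.

```python
import cmath
import math


def mobius(m, z):
    """Apply the linear fractional transformation with matrix m to z."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)


def rotation_about_i(theta):
    """The matrix in SL(2, R) found in the example, for a given angle."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, s), (-s, c))


def to_disk(z):
    """The isometric isomorphism z -> (iz + 1)/(z + i)."""
    return (1j * z + 1) / (z + 1j)


theta = 1.1
m = rotation_about_i(theta)
z = complex(0.7, 2.3)

image_of_i = mobius(m, 1j)

# Moving to the disk should turn the map into w -> e^{i theta} w.
lhs = to_disk(mobius(m, z))
rhs = cmath.exp(1j * theta) * to_disk(z)
```

Here `image_of_i` should equal `1j`, and `lhs` should equal `rhs`, up to rounding error, for any choice of `theta` and any `z` in the upper half-plane.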


> Example Let $C$ be the hyperbolic circle in $\mathbb{D}^2$ centered at $0$ and with radius $r$. 
Find the Euclidean center and radius of $C$ as functions of $r$. 

We already proved that hyperbolic circles are Euclidean circles previously, but now 
we can give a nicer proof of that fact. If 


$$d_{\mathrm{hyper}}(z, 0) = \cosh^{-1}\left(1 + \frac{2|z|^2}{1 - |z|^2}\right) = \cosh^{-1}\left(\frac{1 + |z|^2}{1 - |z|^2}\right) = r,$$

then 

$$\frac{1 + |z|^2}{1 - |z|^2} = \cosh(r),$$

so 

$$|z|^2 = \frac{\cosh(r) - 1}{\cosh(r) + 1},$$

which is a Euclidean circle centered at $0$. Since 


$$\tanh(r/2)^2 = \left(\frac{e^{r/2} - e^{-r/2}}{e^{r/2} + e^{-r/2}}\right)^2 = \frac{e^{r} + e^{-r} - 2}{e^{r} + e^{-r} + 2} = \frac{\cosh(r) - 1}{\cosh(r) + 1},$$

the radius of this circle is $\tanh(r/2)$, where $\tanh(z) = \sinh(z)/\cosh(z)$ is the 
hyperbolic tangent. 
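
As a quick check of this formula, the sketch below (the function name is hypothetical) takes a point at Euclidean distance $\tanh(r/2)$ from the origin and recovers its hyperbolic distance from $0$ using the explicit disk formula.

```python
import math


def d_disk_to_origin(z):
    """Hyperbolic distance from z to 0 in the Poincare disk."""
    return math.acosh(1 + 2 * abs(z) ** 2 / (1 - abs(z) ** 2))


r = 1.3
euclidean_radius = math.tanh(r / 2)

# Any point on the Euclidean circle of this radius will do.
recovered = d_disk_to_origin(complex(euclidean_radius, 0))
```

The value `recovered` should agree with `r` up to rounding error.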


4.7 Quaternions 


I made a promise earlier in this chapter that we would find a nice metric space such 
that Mob(2) will be its group of isometries. We shall do this in the next section, when 
we shall introduce three-dimensional hyperbolic space. Before that, we require an 
interlude to talk about something which might seem completely unconnected from 
what has come before in this chapter. The rough reason for this is the following: com- 
plex numbers are very useful for describing two-dimensional hyperbolic space, as we 
have seen. To try to expand two-dimensional hyperbolic space to three-dimensional 
hyperbolic space, it thus makes sense to try to expand the complex numbers and look 
at some sort of larger algebraic structure. 

The question of how to do this became of great interest to mathematicians in 
the 19th century and was ultimately resolved by the Irish mathematician William 
Rowan Hamilton defining the quaternions in 1843.¹ Figure 4.18 shows his portrait, 
taken when he was in his mid-50s. Hamilton was seeking some extension of the 
complex numbers with a well-defined notion of addition and multiplication which 
would have nice properties. Originally, his efforts concentrated on finding some such 


¹ It’s worth noting that Carl Friedrich Gauss independently defined the quaternions in 1819, but 
Gauss had a bad habit of only publishing results that he felt were very important and well-presented. 
As a result, this as well as many other of his findings were only published decades after his death 
in 1855, with the credit going to mathematicians who did publish. 

Fig. 4.18 William Rowan Hamilton, circa 1860. 


operations on triples of real numbers (x, y, z), analogous to how complex numbers 
can be represented by pairs of real numbers (x, y). What he did not know was that— 
assuming some very reasonable restrictions on what this operation had to be—this 
was impossible. This was the Frobenius theorem for real division algebras, which 
wouldn’t be proved until 1877, a decade after Hamilton’s death. 

The profound realization that pushed Hamilton to a more fruitful avenue came 
in 1843, when he was walking with his wife along the Royal Canal in Dublin. 
Hamilton was so energized by this revelation that he immediately took out his knife 
and carved the equation $i^2 = j^2 = k^2 = ijk = -1$ on a stone of the Broom Bridge. 
Sadly, the original inscription has not survived, but there is to this day a plaque there 
commemorating this moment in mathematical history. 

What did Hamilton’s equation mean? His idea was that rather than looking at 
triples, one should instead look at quadruples of real numbers $(x, y, z, t) \in \mathbb{R}^4$. 
Furthermore, for convenience, write these as $x + yi + zj + tk$. The rules for addition 
are exactly what you would expect: 


$$(x_1 + y_1 i + z_1 j + t_1 k) + (x_2 + y_2 i + z_2 j + t_2 k) = (x_1 + x_2) + (y_1 + y_2)i + (z_1 + z_2)j + (t_1 + t_2)k.$$
The rules for multiplication are a little more complicated but are mostly captured 
in this equation $i^2 = j^2 = k^2 = ijk = -1$, which tells you what to do with the 
symbols $i$, $j$, and $k$. To get the full rule-set, so to speak, you need the following 
additional information. 


1. For any real number $r$ and quaternion $x + yi + zj + tk$, $r(x + yi + zj + tk) = 
(x + yi + zj + tk)r = (rx) + (ry)i + (rz)j + (rt)k$. (This is expected; this is 
how scaling vectors in $\mathbb{R}^4$ normally works.) 

2. For all quaternions $q_1, q_2, q_3$, $q_1(q_2 q_3) = (q_1 q_2)q_3$. (This is the associative 
property that we previously saw in the definition of groups.) 

3. For all quaternions $q_1, q_2, q_3$, $q_1(q_2 + q_3) = q_1 q_2 + q_1 q_3$ and $(q_1 + q_2)q_3 = 
q_1 q_3 + q_2 q_3$. (This is the distributive property of multiplication, which we know 
holds true for real and complex numbers.) 


Once you know all of this, the general multiplication of quaternions can be deduced. 
Let’s try a few simple cases first. Suppose we want to know what $ij$ is. We know that 
$ijk = -1$, so $ijk^2 = -k$; since $k^2 = -1$, $ij = k$. What about $ji$? One might expect 
this to be $k$ as well, but this is not so: 

$$ji = -ji(ijk) = -j(i^2)jk = j^2 k = -k.$$

This was the second part of Hamilton’s insight; to get the desired extension of the 
complex numbers, one needs to turn to a multiplication that is not commutative, 
which is to say that the order in which things are multiplied matters. This shouldn’t faze 
us too much, since we have mostly been dealing with non-commutative multiplication 
thus far already. But in the mid-1800s, this was rather unexpected. In any case, using 
similar reasoning, one can work out that 


$$ij = k, \qquad jk = i, \qquad ki = j,$$
$$ji = -k, \qquad kj = -i, \qquad ik = -j.$$


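(See Exercise 4.1.3.) This gives enough information to do quaternion multiplication 
in general. For example, 


$$\begin{aligned}
(1 - i + j)(2i + k) &= 1(2i + k) - i(2i + k) + j(2i + k) \\
&= 2i + k - 2i^2 - ik + 2ji + jk \\
&= 2i + k + 2 + j - 2k + i \\
&= 2 + 3i + j - k,
\end{aligned}$$

or even 


$$\begin{aligned}
(x_1 + y_1 i + z_1 j + t_1 k)(x_2 + y_2 i + z_2 j + t_2 k) &= x_1(x_2 + y_2 i + z_2 j + t_2 k) \\
&\quad + y_1 i(x_2 + y_2 i + z_2 j + t_2 k) \\
&\quad + z_1 j(x_2 + y_2 i + z_2 j + t_2 k) \\
&\quad + t_1 k(x_2 + y_2 i + z_2 j + t_2 k) \\
&= (x_1 x_2 - y_1 y_2 - z_1 z_2 - t_1 t_2) \\
&\quad + (x_1 y_2 + y_1 x_2 + z_1 t_2 - t_1 z_2)i \\
&\quad + (x_1 z_2 - y_1 t_2 + z_1 x_2 + t_1 y_2)j \\
&\quad + (x_1 t_2 + y_1 z_2 - z_1 y_2 + t_1 x_2)k.
\end{aligned}$$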


This allows us to give an unambiguous definition of the quaternions. 


Definition 4.13 The qguaternions 1 are the set of all formal symbols x + yi +zj + 
tk with x, y,z,¢ € R, along with two binary operations dubbed addition + and 
multiplication $\cdot$, defined by 

$$(x_1 + y_1 i + z_1 j + t_1 k) + (x_2 + y_2 i + z_2 j + t_2 k) = (x_1 + x_2) + (y_1 + y_2)i + (z_1 + z_2)j + (t_1 + t_2)k$$

and 

$$\begin{aligned}
(x_1 + y_1 i + z_1 j + t_1 k) \cdot (x_2 + y_2 i + z_2 j + t_2 k) &= (x_1 x_2 - y_1 y_2 - z_1 z_2 - t_1 t_2) \\
&\quad + (x_1 y_2 + y_1 x_2 + z_1 t_2 - t_1 z_2)i \\
&\quad + (x_1 z_2 - y_1 t_2 + z_1 x_2 + t_1 y_2)j \\
&\quad + (x_1 t_2 + y_1 z_2 - z_1 y_2 + t_1 x_2)k.
\end{aligned}$$
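
The definition translates directly into code. The following is a minimal sketch, with a hypothetical `Quaternion` class whose fields `x`, `y`, `z`, `t` are the four real coefficients; it is meant only to let the reader experiment with the multiplication rule, not as a full-featured implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    """The quaternion x + y*i + z*j + t*k with real coefficients."""

    x: float = 0
    y: float = 0
    z: float = 0
    t: float = 0

    def __add__(self, other):
        return Quaternion(
            self.x + other.x, self.y + other.y,
            self.z + other.z, self.t + other.t,
        )

    def __mul__(self, other):
        x1, y1, z1, t1 = self.x, self.y, self.z, self.t
        x2, y2, z2, t2 = other.x, other.y, other.z, other.t
        return Quaternion(
            x1 * x2 - y1 * y2 - z1 * z2 - t1 * t2,
            x1 * y2 + y1 * x2 + z1 * t2 - t1 * z2,
            x1 * z2 - y1 * t2 + z1 * x2 + t1 * y2,
            x1 * t2 + y1 * z2 - z1 * y2 + t1 * x2,
        )


I = Quaternion(0, 1, 0, 0)
J = Quaternion(0, 0, 1, 0)
K = Quaternion(0, 0, 0, 1)
MINUS_ONE = Quaternion(-1, 0, 0, 0)

# The worked product (1 - i + j)(2i + k).
product = Quaternion(1, -1, 1, 0) * Quaternion(0, 2, 0, 1)
```

With these definitions, `I * J` equals `K` while `J * I` equals `Quaternion(0, 0, 0, -1)`, each of `I * I`, `J * J`, `K * K`, and `I * J * K` equals `MINUS_ONE`, and `product` is `Quaternion(2, 3, 1, -1)`, that is, $2 + 3i + j - k$.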


Remark 4.9 It would make sense to denote the quaternions by Q, but sadly this 
conflicts with the convention that Q denotes the rationals. Instead, they are usually 
denoted by an ‘H’ in honor of Hamilton. Some authors use H as the symbol, but as 
we use this for hyperbolic space, I have compromised to use 1 instead. 


Remark 4.10 This is by no means the only possible way to express the quaternions. 
An alternative construction is explored in Exercise 4.2.13. 


Quaternions were initially quite popular after Hamilton introduced them, and there 
was a push to incorporate them in physics; indeed, Maxwell wrote down his equa- 
tions for electromagnetism in terms of quaternions. After a time, this approach was 
abandoned in favor of matrices and vectors, which were championed by Heaviside, 
Gibbs, and others. Quaternions fell into partial ignominy, although they continued to 
have interest in pure mathematics. Their resurgence as objects of inquiry for applied 
mathematics came through computer science: quaternions, like matrices, can be used 
to describe three-dimensional rotations (see Figure 4.19 and Exercise 4.4.5) but they 
do not suffer from a phenomenon known as gimbal lock. 

Much like the complex numbers have complex conjugation, the quaternions have 
quaternion conjugation, defined as 


$$\overline{x + yi + zj + tk} = x - yi - zj - tk.$$

In other words, you leave the real component of the quaternion alone and change the 
signs of the other, “imaginary”, components. Quaternion conjugation has many of 
the same properties as complex conjugation. It can be used to define the norm and 
trace of a quaternion, for example. 


Definition 4.14 For any quaternion $q$, its norm is $|q| = \sqrt{q\overline{q}}$, and its trace is 
$\mathrm{tr}(q) = q + \overline{q}$. 


It might not be obvious how to interpret the square root here. As it happens, quaternions 
can possess infinitely many different quaternion square roots; for instance, 
it isn’t difficult to see that $(\cos(\theta)i + \sin(\theta)j)^2 = -1$ for all $\theta \in \mathbb{R}$. However, 
everything is completely aboveboard, because 

Fig. 4.19 Quaternions are convenient for describing smooth rotations. The last image (f) is a rotation 
of the first image (a) using the transformation $v \mapsto qvq^{-1}$, where $q = (-5 + 4i + 2j - 2k)/7$ and 
we identify three-dimensional vectors $v = (x, y, z)$ with quaternions $xi + yj + zk$. The intermediate 
images (b)-(e) are also rotations using quaternions, using an interpolation between $1$ and $q$. 


$$(x + yi + zj + tk)\overline{(x + yi + zj + tk)} = (x + yi + zj + tk)(x - yi - zj - tk) = x^2 + y^2 + z^2 + t^2 \geq 0.$$

Thus, $q\overline{q}$ is a non-negative real number and so it has a unique non-negative square 
root. This square root, the norm, is nothing more than the Euclidean norm of the 
vector $(x, y, z, t)$. Similarly, the trace has a straightforward interpretation as well, 
since 


$$x + yi + zj + tk + \overline{x + yi + zj + tk} = x + yi + zj + tk + x - yi - zj - tk = 2x,$$

so it is twice the real part of the quaternion. The term “trace” might seem odd 
here because we previously used this term in reference to matrices. This is not a 
coincidence. (See Exercise 4.2.13.) There are a number of other important properties 
enjoyed by quaternion conjugation, summarized below. 


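Theorem 4.13 (Basic Properties of Quaternion Conjugation) Let $q, q'$ be quaternions. 
Then 


1. $q = \overline{q}$ if and only if $q \in \mathbb{R}$, 
2. $|\overline{q}| = |q|$, 
3. $q = 0$ if and only if $|q| = 0$, 
4. $\overline{qq'} = \overline{q'}\,\overline{q}$, 
5. $|qq'| = |q||q'|$. 
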
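Proof I leave the first three as exercises for the reader (specifically, Exercise 4.2.17). 
We will prove the fourth property by pure computation. Let 

$$q = x_1 + y_1 i + z_1 j + t_1 k, \qquad q' = x_2 + y_2 i + z_2 j + t_2 k,$$

so 

$$\begin{aligned}
qq' &= (x_1 x_2 - y_1 y_2 - z_1 z_2 - t_1 t_2) \\
&\quad + (x_1 y_2 + y_1 x_2 + z_1 t_2 - t_1 z_2)i \\
&\quad + (x_1 z_2 - y_1 t_2 + z_1 x_2 + t_1 y_2)j \\
&\quad + (x_1 t_2 + y_1 z_2 - z_1 y_2 + t_1 x_2)k.
\end{aligned}$$

Swapping $q$ and $q'$ switches the index 1 with the index 2 in the above product. 
Replacing $q$ with $\overline{q}$ and $q'$ with $\overline{q'}$ switches the sign of each term with exactly one 
factor of $x_1$ or $x_2$. Taken together, these changes mean that 

$$\begin{aligned}
\overline{q'}\,\overline{q} &= (x_1 x_2 - y_1 y_2 - z_1 z_2 - t_1 t_2) \\
&\quad + (-x_2 y_1 - y_2 x_1 + z_2 t_1 - t_2 z_1)i \\
&\quad + (-x_2 z_1 - y_2 t_1 - z_2 x_1 + t_2 y_1)j \\
&\quad + (-x_2 t_1 + y_2 z_1 - z_2 y_1 - t_2 x_1)k \\
&= (x_1 x_2 - y_1 y_2 - z_1 z_2 - t_1 t_2) \\
&\quad - (x_1 y_2 + y_1 x_2 + z_1 t_2 - t_1 z_2)i \\
&\quad - (x_1 z_2 - y_1 t_2 + z_1 x_2 + t_1 y_2)j \\
&\quad - (x_1 t_2 + y_1 z_2 - z_1 y_2 + t_1 x_2)k,
\end{aligned}$$

which is nothing more than $\overline{qq'}$. Finally, knowing that $\overline{qq'} = \overline{q'}\,\overline{q}$, it immediately 
follows that 


$$\begin{aligned}
|qq'| &= \left(qq'\,\overline{qq'}\right)^{1/2} = \left(qq'\,\overline{q'}\,\overline{q}\right)^{1/2} \\
&= \left(q|q'|^2\overline{q}\right)^{1/2} = |q'|\left(q\overline{q}\right)^{1/2} \\
&= |q'||q| = |q||q'|,
\end{aligned}$$


since at the end we are simply dealing with multiplication of real numbers, which is 
commutative. 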


One quick consequence of this is that for any real number $r$, $\overline{rq} = \overline{q}\,\overline{r} = \overline{q}r = r\overline{q}$. 
A second is that if $|q| \neq 0$, then we may consider the quaternion $\overline{q}/|q|^2$, and since 


$$q\,\overline{q}/|q|^2 = |q|^2/|q|^2 = 1 = \left(\overline{q}/|q|^2\right)q,$$

we see that this is $q^{-1}$; that is, any non-zero quaternion is invertible. This means 
that one can cancel non-zero quaternions like one can with complex numbers. For 
example, if $qq' = qq''$ and $q \neq 0$, then $q' = q''$. However, the reader should be left 
with two very important warnings. 


1. Cancellation can either be done on the left or on the right, but one cannot mix one 
and the other. For instance, it does not follow from $ij = -ji$ that $j = -j$. 

2. For complex numbers, $(zw)^{-1} = z^{-1}w^{-1}$. This is not true for quaternions; one 
must instead use $(qq')^{-1} = q'^{-1}q^{-1}$. Indeed, $(ij)^{-1} = -k = (-j)(-i) = j^{-1}i^{-1}$. 
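
These identities are easy to test on concrete quaternions. In the sketch below, quaternions are represented as 4-tuples $(x, y, z, t)$ and all helper names (`qmul`, `conj`, `norm`, `inverse`, `close`) are hypothetical. It checks that conjugation and inversion both reverse the order of a product, and that the norm is multiplicative.

```python
import math


def qmul(p, q):
    """Quaternion product of two 4-tuples (x, y, z, t)."""
    x1, y1, z1, t1 = p
    x2, y2, z2, t2 = q
    return (
        x1 * x2 - y1 * y2 - z1 * z2 - t1 * t2,
        x1 * y2 + y1 * x2 + z1 * t2 - t1 * z2,
        x1 * z2 - y1 * t2 + z1 * x2 + t1 * y2,
        x1 * t2 + y1 * z2 - z1 * y2 + t1 * x2,
    )


def conj(q):
    x, y, z, t = q
    return (x, -y, -z, -t)


def norm(q):
    return math.sqrt(sum(c * c for c in q))


def inverse(q):
    """Conjugate divided by the squared norm; q must be non-zero."""
    n2 = sum(c * c for c in q)
    return tuple(c / n2 for c in conj(q))


def close(p, q, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(p, q))


q1 = (1.0, 2.0, -1.0, 0.5)
q2 = (-0.5, 1.5, 3.0, 2.0)

conj_reverses = close(conj(qmul(q1, q2)), qmul(conj(q2), conj(q1)))
inverse_reverses = close(inverse(qmul(q1, q2)), qmul(inverse(q2), inverse(q1)))
wrong_order_fails = not close(inverse(qmul(q1, q2)), qmul(inverse(q1), inverse(q2)))
norm_multiplicative = abs(norm(qmul(q1, q2)) - norm(q1) * norm(q2)) < 1e-9
```

For this pair, which does not commute, all four flags should come out `True`; in particular, `wrong_order_fails` illustrates the second warning above.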

Example Any vector $v \in \mathbb{R}^3$ can be identified with a traceless quaternion
as follows: if $v = (x, y, z)$, then the corresponding quaternion is $xi + yj + zk$.
Therefore, we can write any quaternion in the form $t + v$ for some $t \in \mathbb{R}$ and $v \in \mathbb{R}^3$.
Given $v_1, v_2 \in \mathbb{R}^3$, express $v_1v_2$ (that is, their product as quaternions) in terms
of standard vector operations.

Write $v_1 = (x_1, y_1, z_1)$ and $v_2 = (x_2, y_2, z_2)$. Then
\[
\begin{aligned}
v_1v_2 &= (x_1i + y_1j + z_1k)(x_2i + y_2j + z_2k) \\
&= -(x_1x_2 + y_1y_2 + z_1z_2) + (y_1z_2 - y_2z_1)i \\
&\quad + (x_2z_1 - x_1z_2)j + (x_1y_2 - x_2y_1)k.
\end{aligned}
\]
The real component is easy to recognize: this is $-v_1 \cdot v_2$, the negative of the dot product. The rest is a
little more obscure: it is the cross product $v_1 \times v_2$. Therefore, $v_1v_2 = -v_1 \cdot v_2 + v_1 \times v_2$.
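The identity $v_1v_2 = -v_1 \cdot v_2 + v_1 \times v_2$ can be spot-checked on concrete vectors. The sketch below is illustrative only: it stores a quaternion $t + xi + yj + zk$ as the tuple `(t, x, y, z)`, and the helper names `qmul`, `dot`, `cross`, and `as_quaternion` are assumptions, not notation from the text.

```python
def qmul(p, q):
    """Product of quaternions stored as (real, i, j, k) components."""
    x1, y1, z1, t1 = p
    x2, y2, z2, t2 = q
    return (
        x1 * x2 - y1 * y2 - z1 * z2 - t1 * t2,
        x1 * y2 + y1 * x2 + z1 * t2 - t1 * z2,
        x1 * z2 - y1 * t2 + z1 * x2 + t1 * y2,
        x1 * t2 + y1 * z2 - z1 * y2 + t1 * x2,
    )


def dot(v, w):
    """Euclidean dot product in R^3."""
    return sum(a * b for a, b in zip(v, w))


def cross(v, w):
    """Euclidean cross product in R^3."""
    (x1, y1, z1), (x2, y2, z2) = v, w
    return (y1 * z2 - z1 * y2, z1 * x2 - x1 * z2, x1 * y2 - y1 * x2)


def as_quaternion(v):
    """Identify (x, y, z) with the traceless quaternion xi + yj + zk."""
    return (0, *v)


v1, v2 = (1, 2, 3), (4, 5, 6)
product = qmul(as_quaternion(v1), as_quaternion(v2))
# Real part -v1.v2, imaginary part v1 x v2.
expected = (-dot(v1, v2), *cross(v1, v2))
```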


4.8 Hyperbolic 3-Space 


Hyperbolic 3-space is to the hyperbolic plane what 3-dimensional Euclidean space 
is to the Euclidean plane. There are many different models of hyperbolic space. For 
example, one can picture it by taking an open ball and defining a distance function that 
measures points farther from the origin as being farther apart—this is the Poincaré 
ball model, which is an analog to the Poincaré disk model. Another option is to build 
an analog of the Poincaré half-plane model, as follows. Take the set of all points 
$(x, y, z) \in \mathbb{R}^3$ with $z > 0$; this is the upper half of three-dimensional Euclidean
space. Then, define a distance function on this set that measures points with smaller 
z-component as being farther apart—this is the Poincaré half-space model. Other 
models also exist, but for our purposes, we will stick with the Poincaré half-space 
model as it will be the easiest to define and picture. So, with this introduction, we 
define 


\[
\mathbb{H}^3 = \{\, x + yi + zj \in \mathbb{H} \mid z > 0 \,\},
\]


Fig. 4.20 The Poincaré half-space, together with a collection of hyperbolic lines (to be defined 
later) rendered in red. 


where we choose to identify $\mathbb{R}^3$ with a subset of the quaternions in much the same
way that we identified $\mathbb{R}^2$ with $\mathbb{C}$. The important thing to figure out here is what the
right distance function for this space should be. Recall that for points $z_1, z_2 \in \mathbb{H}^2$, we
had
\[
d_{\mathrm{hyper}}(z_1, z_2) = \cosh^{-1}\left(1 + \frac{|z_1 - z_2|^2}{2\,\Im(z_1)\,\Im(z_2)}\right).
\]


It, therefore, seems reasonable to guess that the following definition is likely sensible. 


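Definition 4.15 The Poincaré half-space model of three-dimensional hyperbolic
space consists of the set $\mathbb{H}^3$ together with the distance function
\[
\begin{aligned}
d_{\mathrm{hyper}} : \mathbb{H}^3 \times \mathbb{H}^3 &\to [0, \infty) \\
(q_1, q_2) &\mapsto \cosh^{-1}\left(1 + \frac{|q_1 - q_2|^2}{2\,\pi_j(q_1)\,\pi_j(q_2)}\right),
\end{aligned}
\]
where $\pi_j(q)$ is the $j$-th component of $q$, i.e., if $q = x + yi + zj + tk$, then $\pi_j(q) = z$.
The boundary $\partial\mathbb{H}^3$ of $\mathbb{H}^3$ is $\mathbb{CP}^1$.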
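As a concrete illustration of Definition 4.15, here is a small Python sketch of the distance function. It assumes a point $x + yi + zj$ of the half-space is stored as the triple `(x, y, z)` with `z > 0`; the function name `d_hyper` is chosen here to mirror the notation and is not code from the text.

```python
import math


def d_hyper(q1, q2):
    """Hyperbolic distance between two points (x, y, z), z > 0, of the half-space."""
    (x1, y1, z1), (x2, y2, z2) = q1, q2
    if z1 <= 0 or z2 <= 0:
        raise ValueError("both points must have positive j-component")
    euclid_sq = (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.acosh(1 + euclid_sq / (2 * z1 * z2))


# Along the vertical axis, the distance from j to tj works out to ln(t):
# 1 + (t - 1)^2 / (2t) = (t + 1/t) / 2 = cosh(ln t).
dist_j_to_ej = d_hyper((0.0, 0.0, 1.0), (0.0, 0.0, math.e))

# The same Euclidean displacement costs more hyperbolic distance lower down.
low = d_hyper((0.0, 0.0, 0.1), (1.0, 0.0, 0.1))
high = d_hyper((0.0, 0.0, 10.0), (1.0, 0.0, 10.0))
```

The last two values illustrate the informal description given earlier: points with smaller $z$-component are measured as being farther apart.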


Figure 4.20 illustrates what this space looks like. The definition we have chosen 
is exactly right—in principle, we could justify this by some argument similar to that 
from Section 4.3. It is perhaps not immediately clear that this is a metric space, but 
we shall prove that later. One of the advantages of defining $\mathbb{H}^3$ this way is that there
is a simple way of describing how matrices in $SL(2, \mathbb{C})$ transform it.


Definition 4.16 For any
\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{C}),
\]
and any quaternion $q \in \mathbb{H}$, define
\[
M.q = \begin{cases} (aq + b)(cq + d)^{-1} & \text{if } q \neq -d/c, \\ \infty & \text{otherwise.} \end{cases}
\]


Notice that since $-d/c \in \mathbb{C}$, which does not belong to $\mathbb{H}^3$, we are always guaranteed
that if $q \in \mathbb{H}^3$, then $M.q$ is a well-defined quaternion. What is not obvious is
that this quaternion is again in $\mathbb{H}^3$, but this is nevertheless true.


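Lemma 4.12 For any $q \in \mathbb{H}^3$ and any $M \in SL(2, \mathbb{C})$, $M.q \in \mathbb{H}^3$. Furthermore, if
$\pi_j(q)$ is the $j$-th component of $q$ and
\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
\]
then $\pi_j(M.q) = |cq + d|^{-2}\,\pi_j(q)$.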


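Proof We can write $q = \alpha + rj$ for some complex number $\alpha$ and some positive
real number $r$. Note that for any complex number $\beta$, $\beta j = j\bar{\beta}$. With that in hand, we first
compute
\[
|cq + d|^2 = |c\alpha + d + crj|^2 = |c\alpha + d|^2 + r^2|c|^2,
\]
where we have used the basic fact that $|(x, y, z)|^2 = |(x, y)|^2 + z^2$. Then, since
$(cq + d)^{-1} = \overline{(cq + d)}/|cq + d|^2$,
\[
\begin{aligned}
|cq + d|^2 (M.q) &= (aq + b)\,\overline{(cq + d)} \\
&= (a\alpha + b + raj)(\bar{c}\bar{\alpha} + \bar{d} - rj\bar{c}) \\
&= (a\bar{c}|\alpha|^2 + a\alpha\bar{d} + b\bar{c}\bar{\alpha} + b\bar{d} + r^2a\bar{c}) + raj\bar{c}\bar{\alpha} + raj\bar{d} - ra\alpha j\bar{c} - rbj\bar{c} \\
&= (a\bar{c}|q|^2 + a\alpha\bar{d} + b\bar{c}\bar{\alpha} + b\bar{d}) + rac\alpha j + radj - ra\alpha cj - rbcj \\
&= (a\bar{c}|q|^2 + a\alpha\bar{d} + b\bar{c}\bar{\alpha} + b\bar{d}) + r(ad - bc)j \\
&= (a\bar{c}|q|^2 + a\alpha\bar{d} + b\bar{c}\bar{\alpha} + b\bar{d}) + rj.
\end{aligned}
\]
Note that this has no $k$-component and that the $j$-component is unchanged. Dividing by
$|cq + d|^2 > 0$ gives $\pi_j(M.q) = |cq + d|^{-2}\,r > 0$, so $M.q \in \mathbb{H}^3$.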
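Lemma 4.12 can also be sanity-checked numerically. The following Python sketch is an illustration under stated assumptions, not the author's code: quaternions are 4-tuples `(x, y, z, t)`, complex numbers are embedded as `(Re, Im, 0, 0)`, and the helper names `qmul`, `qinv`, `qadd`, `embed`, and `act` are hypothetical. Note that Python's literal `1j` is the complex unit, which corresponds to the quaternion $i$, not to $j$.

```python
def qmul(p, q):
    """Product of quaternions stored as (x, y, z, t) = x + yi + zj + tk."""
    x1, y1, z1, t1 = p
    x2, y2, z2, t2 = q
    return (
        x1 * x2 - y1 * y2 - z1 * z2 - t1 * t2,
        x1 * y2 + y1 * x2 + z1 * t2 - t1 * z2,
        x1 * z2 - y1 * t2 + z1 * x2 + t1 * y2,
        x1 * t2 + y1 * z2 - z1 * y2 + t1 * x2,
    )


def qinv(q):
    """Inverse conj(q) / |q|^2 of a non-zero quaternion."""
    x, y, z, t = q
    n2 = x * x + y * y + z * z + t * t
    return (x / n2, -y / n2, -z / n2, -t / n2)


def qadd(p, q):
    return tuple(a + b for a, b in zip(p, q))


def embed(w):
    """Embed a Python complex number as a quaternion with no j, k part."""
    return (w.real, w.imag, 0.0, 0.0)


def act(M, q):
    """Return M.q = (aq + b)(cq + d)^{-1} together with |cq + d|^2."""
    a, b, c, d = (embed(w) for w in M)
    num = qadd(qmul(a, q), b)
    den = qadd(qmul(c, q), d)
    return qmul(num, qinv(den)), sum(x * x for x in den)


# An arbitrary matrix with ad - bc = 1: pick a, b, c and solve for d.
a, b, c = 1 + 2j, 3 - 1j, 0.5j
d = (1 + b * c) / a
q = (0.3, -1.2, 0.7, 0.0)  # the point 0.3 - 1.2i + 0.7j of the half-space

image, den_sq = act((a, b, c, d), q)
k_component = image[3]          # should vanish up to rounding
j_component = image[2]          # should equal pi_j(q) / |cq + d|^2
predicted_j = q[2] / den_sq
```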


One question we might have is whether $(MN).q = M.(N.q)$; we saw that this
is how it worked when we defined how $SL(2, \mathbb{C})$ acted on $\mathbb{CP}^1$. Indeed, this still
holds for $\mathbb{H}^3$.


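Lemma 4.13 For any $\gamma_1, \gamma_2 \in SL(2, \mathbb{C})$ and $q \in \mathbb{H}^3$, $(\gamma_1\gamma_2).q = \gamma_1.(\gamma_2.q)$.


Proof Let
\[
\gamma_1 = \begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix}, \qquad \gamma_2 = \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix}.
\]
Then
\[
(\gamma_1\gamma_2).q = \big((a_1a_2 + b_1c_2)q + (a_1b_2 + b_1d_2)\big)\big((a_2c_1 + c_2d_1)q + (b_2c_1 + d_1d_2)\big)^{-1}
\]
and
\[
\begin{aligned}
\gamma_1.(\gamma_2.q) &= \gamma_1.\big((a_2q + b_2)(c_2q + d_2)^{-1}\big) \\
&= \big(a_1(a_2q + b_2)(c_2q + d_2)^{-1} + b_1\big)\big(c_1(a_2q + b_2)(c_2q + d_2)^{-1} + d_1\big)^{-1} \\
&= \big(a_1(a_2q + b_2) + b_1(c_2q + d_2)\big)(c_2q + d_2)^{-1}\Big(\big(c_1(a_2q + b_2) + d_1(c_2q + d_2)\big)(c_2q + d_2)^{-1}\Big)^{-1} \\
&= \big(a_1(a_2q + b_2) + b_1(c_2q + d_2)\big)\big(c_1(a_2q + b_2) + d_1(c_2q + d_2)\big)^{-1} \\
&= \big((a_1a_2 + b_1c_2)q + (a_1b_2 + b_1d_2)\big)\big((a_2c_1 + c_2d_1)q + (b_2c_1 + d_1d_2)\big)^{-1},
\end{aligned}
\]
which concludes the lemma.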


The most important thing for us about the transformations $q \mapsto M.q$ is that they
don't just give transformations of $\mathbb{H}^3$; they also preserve $d_{\mathrm{hyper}}$.


Lemma 4.14 For any $q_1, q_2 \in \mathbb{H}^3$ and $M \in SL(2, \mathbb{C})$, $d_{\mathrm{hyper}}(M.q_1, M.q_2) =
d_{\mathrm{hyper}}(q_1, q_2)$.


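Proof Recall that any element in $SL(2, \mathbb{C})$ can be written as a product of matrices
of the form
\[
N_b = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad A_u = \begin{pmatrix} u & 0 \\ 0 & u^{-1} \end{pmatrix}
\]
for some $b \in \mathbb{C}$, $u \in \mathbb{C}^\times$. If we can prove that $d_{\mathrm{hyper}}$ is preserved by these basic
matrices, then we automatically know that it will be preserved by all matrices in
$SL(2, \mathbb{C})$. In fact, we can be even more conservative in our choices of matrices,
because
\[
\begin{aligned}
A_u = \begin{pmatrix} u & 0 \\ 0 & u^{-1} \end{pmatrix}
&= \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 & u^{-1} \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 & u \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 & u^{-1} \\ 0 & 1 \end{pmatrix} \\
&= KN_{u^{-1}}KN_uKN_{u^{-1}}.
\end{aligned}
\]
This leaves us with a straightforward computation.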


\[
\frac{|N_b.q_1 - N_b.q_2|^2}{2\,\pi_j(N_b.q_1)\,\pi_j(N_b.q_2)} = \frac{|(q_1 + b) - (q_2 + b)|^2}{2\,\pi_j(q_1)\,\pi_j(q_2)} = \frac{|q_1 - q_2|^2}{2\,\pi_j(q_1)\,\pi_j(q_2)}
\]
and
\[
\begin{aligned}
\frac{|K.q_1 - K.q_2|^2}{2\,\pi_j(K.q_1)\,\pi_j(K.q_2)} &= \frac{|-q_1^{-1} + q_2^{-1}|^2}{2\,|q_1|^{-2}|q_2|^{-2}\,\pi_j(q_1)\,\pi_j(q_2)} \\
&= \frac{|q_1|^2\,|-q_1^{-1} + q_2^{-1}|^2\,|q_2|^2}{2\,\pi_j(q_1)\,\pi_j(q_2)} = \frac{|q_1 - q_2|^2}{2\,\pi_j(q_1)\,\pi_j(q_2)},
\end{aligned}
\]
where we make use of Lemma 4.12 to compute $\pi_j(K.q) = |q|^{-2}\pi_j(q)$.


Fig. 4.21 An inversion of a sphere (drawn in blue) through the unit sphere (drawn in yellow). A
cross-section of this inversion reveals a circle inversion.
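Both ingredients of this proof lend themselves to a quick numerical check. The Python sketch below is illustrative only and makes several assumptions: points of the half-space are triples `(x, y, z)`, the action of $K$ is written out by hand as $-q^{-1} = -\bar{q}/|q|^2$, which sends $(x, y, z)$ to $(-x, y, z)/|q|^2$, and the names `d_hyper`, `translate`, `k_act`, `matmul`, `n_mat`, and `K_MAT` are hypothetical.

```python
import math
from functools import reduce


def d_hyper(q1, q2):
    """Hyperbolic distance between points (x, y, z), z > 0, of the half-space."""
    (x1, y1, z1), (x2, y2, z2) = q1, q2
    euclid_sq = (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.acosh(1 + euclid_sq / (2 * z1 * z2))


def translate(b, q):
    """N_b.q = q + b for a complex number b."""
    x, y, z = q
    return (x + b.real, y + b.imag, z)


def k_act(q):
    """K.q = -q^{-1}, i.e. (x, y, z) -> (-x, y, z) / |q|^2."""
    x, y, z = q
    n2 = x * x + y * y + z * z
    return (-x / n2, y / n2, z / n2)


def matmul(A, B):
    """Product of 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


K_MAT = ((0, 1), (-1, 0))


def n_mat(b):
    return ((1, b), (0, 1))


# The factorization A_u = K N_{1/u} K N_u K N_{1/u}.
u = 2 + 1j
product = reduce(matmul, [K_MAT, n_mat(1 / u), K_MAT, n_mat(u), K_MAT, n_mat(1 / u)])

# Invariance of the distance under the two remaining generators.
q1, q2 = (0.3, -1.2, 0.7), (-2.0, 0.5, 1.9)
base = d_hyper(q1, q2)
after_translation = d_hyper(translate(1 - 2j, q1), translate(1 - 2j, q2))
after_k = d_hyper(k_act(q1), k_act(q2))
```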


For simplicity of exposition, I have chosen to employ the trick shown to allow
us to eliminate diagonal matrices $A_u$ as basic generators that we have to consider.
However, their action is not so complicated: they are simply compositions of rotations
and dilations, just as they were on the plane. We will investigate this more closely in
Chapter 5.

In any case, this action by $SL(2, \mathbb{C})$ preserves a number of other important geometric
notions. To start with, define a generalized sphere to be either a sphere in $\mathbb{R}^3$
or a plane union the point at infinity.


Theorem 4.14 Let $S$ be a generalized sphere and $\gamma \in SL(2, \mathbb{C})$. Then $\gamma.S$ is a
generalized sphere.


Proof It suffices to prove this for the basic kinds of matrices
\[
N_b = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\]
where we are again making use of the trick used in Lemma 4.14 to be able to ignore
diagonal matrices. Since $N_b.q = q + b$ is a translation, it preserves generalized
spheres. It is less clear what is happening with $K.q = -q^{-1}$. However, note that
$q \mapsto \bar{q}$ and $q \mapsto -\bar{q}$ are reflections and so certainly preserve generalized spheres.
On the other hand, the map $q \mapsto \bar{q}^{-1}$ is nothing other than an inversion through the
unit sphere; that is, it takes a point $q$ distance $r$ away from the origin to $|q|^{-2}q$, a
point on the same ray, but now distance $1/r$ away from the origin. This is illustrated
in Figure 4.21. Sphere inversions preserve generalized spheres; we could prove this
in a manner similar to the way we did it in Chapter 2, but it is substantially easier
to argue by cross-sections and symmetry. To wit, let $S$ be the sphere of radius $r$ we
are trying to invert through the unit sphere centered at the origin. Take any plane
that passes through the centers of these two spheres. On this plane, we get a circle of
radius 1 and a circle of radius $r$, and it is easy to see that $q \mapsto \bar{q}^{-1}$ restricted to this
plane inverts the circle of radius $r$ through the circle of radius 1. We already know
that the result of this will be a generalized circle. Since every cross-section of the
image of $S$ is such a generalized circle, and since the bends of all of these circles are
the same, we conclude that the image of $S$ must be a generalized sphere.


Fig. 4.22 The angle of intersection is shown between two intersecting spheres via their tangent
planes. A cross section reveals that this is the angle of intersection between two circles.


It is also true that the transformations given by elements of SL(2, C) preserve 
angles. We will describe this in a manner analogous to the methodology from Chapter 
2: specifically, the angle between two generalized spheres is the angle between their 
two tangent planes at any of the points of intersection, as in Figure 4.22. That this 
angle is the same regardless of the choice of intersection point follows from rotational 
symmetry. 


Theorem 4.15 If $S_1, S_2$ are generalized spheres that intersect at an angle $\theta$ and
$\gamma \in SL(2, \mathbb{C})$, then $\gamma.S_1, \gamma.S_2$ also intersect at an angle $\theta$.


Proof Once again, we only need to prove this for the matrices
\[
N_b = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\]
Translations preserve angles, as do all reflections, so we only need to consider
whether the sphere inversion $q \mapsto \bar{q}^{-1}$ preserves angles or not. Indeed it does,
by a cross-sectional argument: take the plane $P$ through the origin and the centers of
$S_1$ and $S_2$. The intersection of $P$ with $S_1$ and $S_2$ gives two circles that intersect at an
angle $\theta$. A sphere inversion through the origin preserves this plane $P$ and behaves
like a regular circle inversion when restricted to $P$; consequently, the image under
$q \mapsto \bar{q}^{-1}$ will give two generalized spheres intersecting at an angle $\theta$.


Fig. 4.23 Copies of the hyperbolic plane inside of three-dimensional hyperbolic space.


We could also prove that $SL(2, \mathbb{C})$ preserves orientation, but I leave this as a task
for the reader to ponder on their own, as we will not need it here. We are actually
ready to show that $(\mathbb{H}^3, d_{\mathrm{hyper}})$ is a metric space.


Theorem 4.16 Let $G$ be the intersection of any generalized sphere orthogonal to
$\mathbb{CP}^1$ and $\mathbb{H}^3$. Then $(G, d_{\mathrm{hyper}}|_G)$ is isometrically isomorphic to $(\mathbb{H}^2, d_{\mathrm{hyper}})$. As a
consequence, $(\mathbb{H}^3, d_{\mathrm{hyper}})$ is a metric space.


Remark 4.11 This result gives some insight into why we call $\mathbb{H}^3$ three-dimensional
hyperbolic space: as we will see shortly, the generalized spheres $G$ play much the
same role as planes in Euclidean space. We know that if we look at any plane in
Euclidean three-dimensional space, we get Euclidean two-dimensional space. Just
so, if we look at any of these generalized spheres $G$, we get a copy of hyperbolic
two-dimensional space on it, as in Figure 4.23.


Proof The boundary of $G$ is a generalized circle in $\mathbb{CP}^1$. We know that any such
circle can be moved to the real line $y = 0$ by a transformation $z \mapsto \gamma.z$ for some
$\gamma \in SL(2, \mathbb{C})$. The extension of this to $\mathbb{H}^3$ via $q \mapsto \gamma.q$ will move $G$ to the upper
half of the $xz$-plane, which we shall refer to as $Y$; this follows as we know that such
transformations preserve both generalized spheres and angles. Furthermore, we know
that this transformation preserves $d_{\mathrm{hyper}}$. Therefore, if we know that $(Y, d_{\mathrm{hyper}}|_Y)$
is isometrically isomorphic to $(\mathbb{H}^2, d_{\mathrm{hyper}})$, then it will automatically follow that
$(G, d_{\mathrm{hyper}}|_G)$ is as well, because $(Y, d_{\mathrm{hyper}}|_Y)$ and $(G, d_{\mathrm{hyper}}|_G)$ are related by the
isometric isomorphism $q \mapsto \gamma.q$. Why is the choice of $Y$ particularly nice? Well, if
we write down the hyperbolic distance restricted to this plane,
\[
\begin{aligned}
d_{\mathrm{hyper}}|_Y : Y \times Y &\to [0, \infty) \\
(x_1 + z_1j, x_2 + z_2j) &\mapsto \cosh^{-1}\left(1 + \frac{|(x_1 + z_1j) - (x_2 + z_2j)|^2}{2\,\pi_j(x_1 + z_1j)\,\pi_j(x_2 + z_2j)}\right) = \cosh^{-1}\left(1 + \frac{(x_1 - x_2)^2 + (z_1 - z_2)^2}{2z_1z_2}\right),
\end{aligned}
\]
it isn't difficult to notice that this is exactly the hyperbolic distance in $\mathbb{H}^2$: all that
we have done is renamed $i$ to $j$! Therefore, the map $x + yi \mapsto x + yj$ is an isometric
isomorphism between $(\mathbb{H}^2, d_{\mathrm{hyper}})$ and $(Y, d_{\mathrm{hyper}}|_Y)$.

Why does it follow immediately that $(\mathbb{H}^3, d_{\mathrm{hyper}})$ is a metric space? Well, any
three points $q_1, q_2, q_3$ in $\mathbb{H}^3$ lie on some generalized sphere orthogonal to
the boundary. Call the intersection of that sphere with $\mathbb{H}^3$ $G$; we know that
$(G, d_{\mathrm{hyper}}|_G)$ is a metric space and therefore in particular

1. $d_{\mathrm{hyper}}(q_1, q_2) = d_{\mathrm{hyper}}(q_2, q_1)$,
2. $d_{\mathrm{hyper}}(q_1, q_2) = 0$ if and only if $q_1 = q_2$, and
3. $d_{\mathrm{hyper}}(q_1, q_3) \leq d_{\mathrm{hyper}}(q_1, q_2) + d_{\mathrm{hyper}}(q_2, q_3)$.

But since $q_1, q_2, q_3$ were completely arbitrary, we see that $\mathbb{H}^3$ satisfies all of the
properties of a metric space.


We now know that $(\mathbb{H}^3, d_{\mathrm{hyper}})$ is a perfectly kosher metric space; moreover, we
know that $PSL(2, \mathbb{C})$ is a subset of $\mathrm{Isom}(\mathbb{H}^3)$. Furthermore, it's not hard to see that
all of $\text{Möb}(2)$ is in $\mathrm{Isom}(\mathbb{H}^3)$: any element in it can be written as either $z \mapsto \phi(z)$
or $z \mapsto \phi(-\bar{z})$ for some $\phi \in PSL(2, \mathbb{C})$. But
\[
d_{\mathrm{hyper}}(-\bar{q}_1, -\bar{q}_2) = \cosh^{-1}\left(1 + \frac{|-\bar{q}_1 + \bar{q}_2|^2}{2\,\pi_j(q_1)\,\pi_j(q_2)}\right) = d_{\mathrm{hyper}}(q_1, q_2),
\]
so $q \mapsto -\bar{q}$ is an isometry of $\mathbb{H}^3$. We conclude that every element in $\text{Möb}(2)$ can be
understood as an isometry of hyperbolic 3-space. That there are no other isometries is
less clear, but we will show this once we have defined some basic geometric notions
in $\mathbb{H}^3$.


4.9 Hyperbolic Spheres, Planes, and Isometries 


We extend various definitions that we had for the hyperbolic plane to $\mathbb{H}^3$. To start,
let's consider spheres.

Definition 4.17 The hyperbolic sphere with center $p$ and radius $r$ in $\mathbb{H}^3$ is the locus
of points $q \in \mathbb{H}^3$ such that $d_{\mathrm{hyper}}(q, p) = r$.


Theorem 4.17 Hyperbolic spheres are Euclidean spheres (but with different centers 
and radii). 


Proof Let $S$ be a hyperbolic sphere centered at a point $q \in \mathbb{H}^3$. Consider the planes
that pass through $q$ and are orthogonal to $\mathbb{CP}^1$. Each of them is essentially a copy of
$\mathbb{H}^2$ by Theorem 4.16, and so the set of points $q'$ in that plane that are a fixed distance
from $q$ is a Euclidean circle. Furthermore, all of these planes are related by a rotation
around the vertical line through $q$, and so the full set in $\mathbb{H}^3$ is what you get by taking
such a Euclidean circle and rotating it around that line as the axis; in other words,
it is a Euclidean sphere.
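The Euclidean center and radius can be made explicit. Expanding $d_{\mathrm{hyper}}(q, p) = r$ with Definition 4.15 for $p = a + bi + hj$ (a computation not carried out in the text) gives $(x - a)^2 + (y - b)^2 + (z - h\cosh r)^2 = h^2\sinh^2 r$, so the Euclidean center is $(a, b, h\cosh r)$ and the Euclidean radius is $h\sinh r$. The Python sketch below checks this on sample points; the function names `d_hyper` and `euclidean_sphere` are assumptions made for illustration.

```python
import math


def d_hyper(q1, q2):
    """Hyperbolic distance between points (x, y, z), z > 0, of the half-space."""
    (x1, y1, z1), (x2, y2, z2) = q1, q2
    euclid_sq = (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.acosh(1 + euclid_sq / (2 * z1 * z2))


def euclidean_sphere(center, r):
    """Euclidean center and radius of the hyperbolic sphere of radius r about center."""
    a, b, h = center
    return (a, b, h * math.cosh(r)), h * math.sinh(r)


center, r = (0.5, -1.0, 2.0), 0.8
(cx, cy, cz), radius = euclidean_sphere(center, r)

# Sample the Euclidean sphere and measure hyperbolic distance to the center.
distances = []
for i in range(1, 6):
    for k in range(6):
        phi, theta = i * math.pi / 6, k * math.pi / 3
        point = (
            cx + radius * math.sin(phi) * math.cos(theta),
            cy + radius * math.sin(phi) * math.sin(theta),
            cz + radius * math.cos(phi),
        )
        distances.append(d_hyper(center, point))
```

Every entry of `distances` should agree with `r` up to rounding. Note that the Euclidean center sits above the hyperbolic center, since $\cosh r > 1$.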
Fig. 4.24 Some concentric hyperbolic spheres which have been cut away slightly for easier viewing.


Figure 4.24 shows a family of concentric hyperbolic spheres to better illustrate 
this description. Next, we consider hyperbolic planes. 


Definition 4.18 A hyperbolic plane in $\mathbb{H}^3$ is a locus of points in $\mathbb{H}^3$ of the form
\[
\{\, q \in \mathbb{H}^3 \mid d_{\mathrm{hyper}}(q, p_1) = d_{\mathrm{hyper}}(q, p_2) \,\},
\]
for some fixed points $p_1 \neq p_2 \in \mathbb{H}^3$.

We can characterize what hyperbolic planes look like with the aid of a lemma.


Lemma 4.15 For any $p_1 \neq p_2 \in \mathbb{H}^3$, there exists an isometry $\phi \in PSL(2, \mathbb{C})$ such
that $\phi(p_1) = j$ and $\phi(p_2) = tj$ for some $t > 1$.


Proof There exists a Euclidean plane $E$ that passes through $p_1$ and $p_2$, and which is
orthogonal to the boundary. By applying a translation $q \mapsto q + \beta$ with $\beta \in \mathbb{C}$, we may assume
that this plane passes through the origin. Since $E \cap \mathbb{H}^3$ is an isometric copy of $\mathbb{H}^2$,
we know that there exists $\psi \in \mathrm{Isom}(\mathbb{H}^2)$ such that $\psi(p_1) = j$ and $\psi(p_2) = tj$ for
some $t > 1$.


Theorem 4.18 Hyperbolic planes are generalized spheres orthogonal to the boundary
(restricted to $\mathbb{H}^3$). Conversely, any generalized sphere orthogonal to the boundary
is a hyperbolic plane.


Proof Choose any two points $p_1 \neq p_2 \in \mathbb{H}^3$. By Lemma 4.15, we know that
there exists $\phi \in PSL(2, \mathbb{C})$ such that $\phi(p_1) = j$ and $\phi(p_2) = tj$ for some $t > 1$.
Choose any Euclidean plane $E'$ that passes through $0$, $j$, and $tj$; it will automatically
be orthogonal to the boundary, so we know that it is isometrically isomorphic to
$\mathbb{H}^2$. Therefore, we know that the set of points $q \in E'$ such that $d_{\mathrm{hyper}}(q, j) =
d_{\mathrm{hyper}}(q, tj)$ is a hyperbolic line; specifically, it is a circle centered at $0$, and its radius
is uniquely determined by $t$. Note that this does not depend on $E'$, which means that
the hyperbolic plane $P$ defined by $j$ and $tj$ must be a Euclidean sphere centered at
$0$. (This is easiest to see from a picture: see Figure 4.25.) However, $\phi^{-1}(P)$ is the
hyperbolic plane defined by $p_1$ and $p_2$; since $\mathrm{Isom}(\mathbb{H}^3)$ preserves both angles and
generalized spheres, we see that $\phi^{-1}(P)$ is also a generalized sphere orthogonal to
the boundary.

As for the converse, note that any generalized sphere orthogonal to the boundary is
uniquely determined by its intersection with the boundary, which is a circle. For any
two circles $C_1, C_2$, there exists an element $\phi \in PSL(2, \mathbb{C})$ such that $\phi(C_1) = C_2$.
Therefore, we can get any generalized sphere orthogonal to the boundary as the image
under an element of $PSL(2, \mathbb{C})$ of a hyperbolic plane. But elements in $PSL(2, \mathbb{C})$
are isometries, so any such image is also a hyperbolic plane.


(a) (b) (c)

Fig. 4.25 A visualization of the proof of Theorem 4.18. We start with two points $p_1, p_2$, through
which we draw a plane orthogonal to the boundary in (a). In (b), we translate the plane so that it
passes through the origin and move these two points to $j$ and $tj$; at this stage, it is clear that the
restriction of the hyperbolic plane to this plane is a circle centered at 0. Finally, in (c), we observe
that we have rotational symmetry, so the hyperbolic plane is a sphere.
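The radius of the sphere in this proof can be worked out explicitly. Setting $d_{\mathrm{hyper}}(q, j) = d_{\mathrm{hyper}}(q, tj)$ in Definition 4.15 and clearing denominators gives $t\,|q - j|^2 = |q - tj|^2$, which simplifies to $|q|^2 = t$; this step is not spelled out in the text, so treat it as a derived assumption. The Python sketch below confirms numerically that points of the Euclidean hemisphere of radius $\sqrt{t}$ about the origin are equidistant from $j$ and $tj$; the name `d_hyper` is chosen for illustration.

```python
import math


def d_hyper(q1, q2):
    """Hyperbolic distance between points (x, y, z), z > 0, of the half-space."""
    (x1, y1, z1), (x2, y2, z2) = q1, q2
    euclid_sq = (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.acosh(1 + euclid_sq / (2 * z1 * z2))


t = 3.0
radius = math.sqrt(t)
j_point, tj_point = (0.0, 0.0, 1.0), (0.0, 0.0, t)

# Sample the open upper hemisphere |q| = sqrt(t), z > 0.
gaps = []
for i in range(1, 6):
    for k in range(6):
        phi, theta = i * math.pi / 12, k * math.pi / 3  # phi < pi/2 keeps z > 0
        q = (
            radius * math.sin(phi) * math.cos(theta),
            radius * math.sin(phi) * math.sin(theta),
            radius * math.cos(phi),
        )
        gaps.append(abs(d_hyper(q, j_point) - d_hyper(q, tj_point)))

largest_gap = max(gaps)  # should be zero up to rounding
```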


The proof of this theorem suggests another natural object that we should consider. 


Definition 4.19 A set $l \subset \mathbb{H}^3$ is a hyperbolic line if there exists a hyperbolic plane
$P$ such that $l \subset P$ and $l$ is a hyperbolic line inside $P$ (where $P$ is considered as a
copy of $\mathbb{H}^2$ in the usual way).


The awkward part of this definition is that it seems like it might depend on which 
particular plane P we choose. It does not. 


Theorem 4.19 For any $l \subset \mathbb{H}^3$, the following are equivalent.

1. $l$ is a hyperbolic line.

2. $l$ is the non-trivial intersection of two hyperbolic planes. (By non-trivial, we mean
that the intersection is neither empty nor the union of the two planes.)

3. $l$ is either a line or a half-circular arc orthogonal to the boundary $\mathbb{CP}^1$.


Proof Suppose $l$ is a hyperbolic line contained inside some plane $P$. It must intersect
the boundary at two points $a, b \in \mathbb{CP}^1$. Choose any $\phi \in PSL(2, \mathbb{C})$ such that
$\phi(a) = 0$ and $\phi(b) = \infty$. Then $\phi(P)$ must be a Euclidean plane through the origin
orthogonal to the boundary, in which case $\phi(l)$ is the vertical line through the origin.
Choose any other Euclidean plane $E$ that passes through the origin and is orthogonal
to the boundary. Evidently, $\phi(l)$ is the intersection of $E$ and $\phi(P)$, which means that
$l$ is the intersection of $\phi^{-1}(E)$ and $P$. Therefore, if $l$ is a hyperbolic line, then it is the
non-trivial intersection of two hyperbolic planes. Conversely, if $l$ is the intersection
of two hyperbolic planes, we can use an isometry $\phi \in PSL(2, \mathbb{C})$ to move one of
these planes to a Euclidean plane $E$ through the origin orthogonal to the boundary,
in which case $\phi(l)$ has to be either a line or a half-circular arc orthogonal to the
boundary, that is, a hyperbolic line. However, if $\phi(l)$ is a hyperbolic line in $E$, then
$l$ is a hyperbolic line in $\phi^{-1}(E)$. The construction of this line can be seen in Figure
4.26.

On the other hand, since hyperbolic planes are either spheres or planes orthogonal
to the boundary, $l$ is an intersection of two such planes if and only if it is either a line
or a half-circular arc orthogonal to the boundary.


Fig. 4.26 A visualization of the idea behind Theorem 4.19: the general case of the intersection of
any two hyperbolic planes (shown on the left) can be transformed to the specific case where they
are both planes through the origin (shown on the right).


Note that it is immediate from this theorem that if / is a hyperbolic line inside of 
one plane, then it is a hyperbolic line in any plane that contains it. This is the same 
as it is in Euclidean space, as is the following characterization. 


Theorem 4.20 For any two distinct points $p_1, p_2 \in \mathbb{H}^3$, there is a unique hyperbolic
line that passes through them. For any three distinct points $p_1, p_2, p_3 \in \mathbb{H}^3$ that do not
lie on a common hyperbolic line, there exists a unique hyperbolic plane that passes through them.


Proof For any $p_1 \neq p_2 \in \mathbb{H}^3$, by Lemma 4.15, there exists an isometry $\phi \in
PSL(2, \mathbb{C})$ such that $\phi(p_1) = j$ and $\phi(p_2) = tj$ for some $t > 1$. There is a unique
hyperbolic line that passes through these two points: this is the vertical line through
the origin. If we also have a third point $p_3$, then $\phi(p_3)$ will be some point off this line,
and there is exactly one hyperbolic plane that passes through all three of these points:
this will be the Euclidean plane through the origin, orthogonal to the boundary, and
passing through $\phi(p_3)$.


This gives us enough information to cleanly prove that $\mathrm{Isom}(\mathbb{H}^3) = \text{Möb}(2)$.


Theorem 4.21 (Characterization of Isometries of $\mathbb{H}^3$)
If $\Psi \in \mathrm{Isom}(\mathbb{H}^3, d_{\mathrm{hyper}})$, then either $\Psi(q) = \psi(q)$ for some $\psi \in PSL(2, \mathbb{C})$, or
$\Psi(q) = -\overline{\psi(q)}$ for some $\psi \in PSL(2, \mathbb{C})$.


Fig. 4.27 A visualization of the proof of Theorem 4.21. In the first image, we see that measuring
distance to a point (drawn in red) from $j$ and $ej$ narrows down its location to a circle which is the
intersection of the hyperbolic spheres centered at $j$ and $ej$. In the second image, we see the effect of
adding a third point $i + j$ from which to measure distance: there are now only two possible points.
These two points are reflections across the plane through $i$ and $j$, which is illustrated in the last
figure. Adding a fourth point not on this plane allows one to determine the point uniquely.


Proof We already know that the given transformations are isometries; the difficulty
is in proving that they are the only ones. Choose any $\Psi \in \mathrm{Isom}(\mathbb{H}^3)$ and consider
the points $p_1 = \Psi(j)$, $p_2 = \Psi(ej)$. By Lemma 4.15, we know that there exists
$\phi \in PSL(2, \mathbb{C})$ such that $\phi(p_1) = j$ and $\phi(p_2) = tj$ for some $t > 1$; indeed, $t = e$.
Therefore, without loss of generality, we may assume that $\Psi(j) = j$ and $\Psi(ej) =
ej$. For any point $p \in \mathbb{H}^3$, define $r_{p,1} = d_{\mathrm{hyper}}(j, p)$ and $r_{p,2} = d_{\mathrm{hyper}}(ej, p)$.
Furthermore, let $S_{p,1}$ be the hyperbolic sphere with center $j$ and radius $r_{p,1}$; let $S_{p,2}$
be the hyperbolic sphere with center $ej$ and radius $r_{p,2}$. Clearly,

1. the intersection of $S_{p,1}$ and $S_{p,2}$ is a (Euclidean) circle centered on a point on the
$z$-axis and parallel to the $xy$-plane,

2. $p$ lies on the aforementioned circle, and

3. $\Psi(p)$ also lies on this circle.


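We want to show that any rotation around the $z$-axis can be obtained by an isometry
in $PSL(2, \mathbb{C})$. Indeed,
\[
\begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix} \in SL(2, \mathbb{C}),
\]
and for any $p = z + tj$,
\[
\begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix}.p = e^{i\theta/2}\,p\,e^{i\theta/2} = e^{i\theta}z + tj
\]
is the desired rotation. So, we can find some rotation $\phi \in PSL(2, \mathbb{C})$ as above
so that $\phi(\Psi(i + j)) = i + j$, but $\phi(j) = j$ and $\phi(ej) = ej$. So, without loss of
generality, we may assume that $\Psi(j) = j$, $\Psi(ej) = ej$, and $\Psi(i + j) = i + j$. Now,
for any $p \in \mathbb{H}^3$, define $r_{p,3} = d_{\mathrm{hyper}}(i + j, p)$ and $S_{p,3}$ to be the hyperbolic sphere
with center $i + j$ and radius $r_{p,3}$. We see that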


1. the intersection of $S_{p,1}$, $S_{p,2}$, and $S_{p,3}$ consists of at most two points,

2. these two points are reflections of each other across the plane through $j$, $ej$, and
$i + j$,

3. the reflection across the aforementioned plane is given by $p \mapsto -\bar{p}$, and

4. both $p$ and $\Psi(p)$ must be among these two points.


Now, choose any point $q$ not in the plane through $j$, $ej$, $i + j$. We have shown that
either $\Psi(q) = q$ or $\Psi(q) = -\bar{q}$. By composing with $p \mapsto -\bar{p}$ if need be, we can
assume that $\Psi(j) = j$, $\Psi(ej) = ej$, $\Psi(i + j) = i + j$, and $\Psi(q) = q$. Now, we
can use the same triangulation technique that we had for $\mathbb{H}^2$, which is demonstrated
in Figure 4.27. It must be that $\Psi(p) = p$ for all $p \in \mathbb{H}^3$ because the only other
possibility would be that $\Psi(p) = -\bar{p}$, but that would change the distance to $q$.
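The rotation used in this proof is easy to verify by direct quaternion multiplication. Since the lower-right entry is $d = e^{-i\theta/2}$, we have $d^{-1} = e^{i\theta/2}$, so the action is $p \mapsto e^{i\theta/2}\,p\,e^{i\theta/2}$. The Python sketch below is an illustration only: it stores quaternions as 4-tuples `(x, y, z, t)`, and the helper name `qmul` is an assumption.

```python
import math


def qmul(p, q):
    """Product of quaternions stored as (x, y, z, t) = x + yi + zj + tk."""
    x1, y1, z1, t1 = p
    x2, y2, z2, t2 = q
    return (
        x1 * x2 - y1 * y2 - z1 * z2 - t1 * t2,
        x1 * y2 + y1 * x2 + z1 * t2 - t1 * z2,
        x1 * z2 - y1 * t2 + z1 * x2 + t1 * y2,
        x1 * t2 + y1 * z2 - z1 * y2 + t1 * x2,
    )


theta = 0.9
half = (math.cos(theta / 2), math.sin(theta / 2), 0.0, 0.0)  # e^{i theta / 2}
p = (0.3, -1.2, 0.7, 0.0)  # the point 0.3 - 1.2i + 0.7j of the half-space

image = qmul(qmul(half, p), half)

# Expected: (x, y) rotated by theta about the z-axis, height z unchanged.
x, y, z, _ = p
expected = (
    x * math.cos(theta) - y * math.sin(theta),
    x * math.sin(theta) + y * math.cos(theta),
    z,
    0.0,
)
max_error = max(abs(a - b) for a, b in zip(image, expected))
```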


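Corollary 4.3 The set of isometries $\mathrm{Isom}(\mathbb{H}^3, d_{\mathrm{hyper}})$ is a group if we take function
composition to be the operation. Furthermore, the map
\[
\begin{aligned}
\text{Möb}(2) &\to \mathrm{Isom}(\mathbb{H}^3, d_{\mathrm{hyper}}) \\
\phi &\mapsto \begin{cases} p \mapsto \gamma.p & \text{if } \phi(z) = \gamma.z \text{ with } \gamma \in SL(2, \mathbb{C}) \\ p \mapsto -\overline{\gamma.p} & \text{if } \phi(z) = -\overline{\gamma.z} \text{ with } \gamma \in SL(2, \mathbb{C}) \end{cases}
\end{aligned}
\]
is a group isomorphism.


Proof By the previous theorem, we know that this map is surjective. Indeed, it must
be injective, since every element in $\text{Möb}(2)$ has a distinct effect on the boundary of
$\mathbb{H}^3$. Function composition on the left maps to function composition on the right;
therefore, $\mathrm{Isom}(\mathbb{H}^3, d_{\mathrm{hyper}})$ is a group and this is an isomorphism.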
Problems


4.1 COMPUTATIONAL EXERCISES


1. a) Prove that
\[
T_t = \begin{pmatrix} \cosh(t/2) & \sinh(t/2) \\ \sinh(t/2) & \cosh(t/2) \end{pmatrix} \in SU(1, 1)
\]
for all $t \in \mathbb{R}$.
b) Find $T_t.0$.
2. Let $C$ be a hyperbolic circle in $\mathbb{D}^2$ with center $re^{i\theta}$ and radius $R$. Find its
Euclidean center and radius as functions in $r, \theta, R$. (Hint: the previous exercise
may be helpful.)
3. Show that
\[
ij = k, \quad jk = i, \quad ki = j, \quad ji = -k, \quad kj = -i, \quad ik = -j,
\]
where $i, j, k$ are quaternions.
4. a) Find integers $a, b, c, d$ such that if $q = a + bi + cj + dk$, then

a. $|q|^2 = 2$
b. $|q|^2 = 3$
c. $|q|^2 = 5$

b) Find integers $a, b, c, d$ such that if $q = a + bi + cj + dk$, then $|q|^2 = 30$.
(Hint: the results of the previous part are very helpful here.)

c) Make a conjecture regarding for which integers $k$ the equation $a^2 + b^2 +
c^2 + d^2 = k$ is solvable in integers.

4.2 PROOFS 


1. The three standard hyperbolic functions are hyperbolic sine, hyperbolic cosine,
and hyperbolic tangent, defined as
\[
\sinh(x) = \frac{e^x - e^{-x}}{2}, \qquad \cosh(x) = \frac{e^x + e^{-x}}{2}, \qquad \tanh(x) = \frac{\sinh(x)}{\cosh(x)}.
\]
The domains of hyperbolic sine and cosine can be taken to be either $\mathbb{R}$ or $\mathbb{C}$. The
hyperbolic tangent can be defined wherever $\cosh(x) \neq 0$.

a) Prove that $\cosh(z) = 0$ if and only if $z = \pi i(k + 1/2)$ for some integer $k$.
b) Sketch the graphs of $\sinh$, $\cosh$, $\tanh$ as functions on $\mathbb{R}$.

c) Prove that $\cosh(x)^2 - \sinh(x)^2 = 1$.

d) Prove that
\[
\begin{aligned}
\sinh(2x) &= 2\sinh(x)\cosh(x) \\
\cosh(2x) &= \cosh(x)^2 + \sinh(x)^2.
\end{aligned}
\]

e) Prove that $\sinh(iz) = i\sin(z)$ and $\cosh(iz) = \cos(z)$.


2. Prove that $SL(2, \mathbb{R})$ is a subgroup of $SL(2, \mathbb{C})$.

3. Prove that $\mathrm{Isom}(\mathbb{H}^2)$ is a group.

4. Complete the proof of Lemma 4.6.

5. Prove that a generalized circle $C$ is orthogonal to $x = 0$ and $y = 0$ if and only
if it is a circle centered at the origin.

6. We show that the metric definition of a plane coincides with the usual definition
of a plane in Euclidean space.

a) Show that if $p_1 \neq p_2 \in \mathbb{R}^3$, then the set of points $p_3$ such that
$d_{\mathrm{Euclid}}(p_1, p_3) = d_{\mathrm{Euclid}}(p_2, p_3)$ is a plane. (Hint: use Euclidean isometries
to reduce to a simple case.)

b) Prove that if $P$ is a plane in $\mathbb{R}^3$, then there exist $p_1 \neq p_2 \in \mathbb{R}^3$ such that
$p_3 \in P$ if and only if $d_{\mathrm{Euclid}}(p_1, p_3) = d_{\mathrm{Euclid}}(p_2, p_3)$.

7. Finish the proof of Corollary 4.2.
8. Prove that $SU(m, n)$ is a group under matrix multiplication.
9. Prove Theorem 4.12. (Hint: use the fact that you already know that the isometries
of $\mathbb{H}^2$ come from elements of $SL(2, \mathbb{R})$.)

10. Consider the line $x = 0$ in $\mathbb{H}^2$. Fix some $r > 0$. For any point $it$ on the line, the
hyperbolic line orthogonal to $x = 0$ at that point is the half-circle consisting of
all points $te^{i\theta}$ with $0 < \theta < \pi$.

a) Show that for any $t > 0$, there exists a unique point $te^{i\theta_t}$ to the right of the
line $x = 0$ such that $d_{\mathrm{hyper}}(it, te^{i\theta_t}) = r$. What is $\theta_t$? Does it depend on $t$?

b) As one varies the parameter $t$, what is the curve $te^{i\theta_t}$? Is it a hyperbolic line?
How does the situation here differ from the Euclidean plane?
11. Let $\Phi : \mathbb{D}^2 \to \mathbb{D}^2$ be a map such that for any $z_1, z_2, z_3 \in \mathbb{D}^2$,
\[
\frac{d_{\mathrm{hyper}}(z_1, z_2)}{d_{\mathrm{hyper}}(z_1, z_3)} = \frac{d_{\mathrm{hyper}}(\Phi(z_1), \Phi(z_2))}{d_{\mathrm{hyper}}(\Phi(z_1), \Phi(z_3))};
\]
that is, $\Phi$ is a similarity.

a) Prove that there exists some constant $c > 0$ such that
\[
d_{\mathrm{hyper}}(\Phi(z_1), \Phi(z_2)) = c\,d_{\mathrm{hyper}}(z_1, z_2)
\]
for all $z_1, z_2 \in \mathbb{D}^2$. This is known as the constant of proportionality.

b) Prove that if $c < 1$ then the constant of proportionality of $\Phi^{-1}$ is greater than
or equal to 1.

c) Prove that the image of a hyperbolic line under a similarity is a hyperbolic 
line. 

d) Prove that the image of a hyperbolic circle with hyperbolic radius R under 
a similarity with a constant of proportionality c is a hyperbolic circle with 
hyperbolic radius cR. 

e) Consider the following configuration of lines and circles in the hyperbolic 
plane. 


That is, in $\mathbb{D}^2$, we consider an idealized hyperbolic triangle with all of its
vertices on the boundary, and a circle inscribed inside of it. Prove that any
similarity of $\mathbb{D}^2$ must send this configuration to an idealized hyperbolic triangle
with all of its vertices on the boundary, and a circle inscribed inside of
it. Furthermore, show that there exists some isometry that will send it to the
same idealized hyperbolic triangle and circle inscribed inside of it.

f) Prove that for any similarity, its constant of proportionality is 1; that is to say, 
it is an isometry. (Hint: consider the configuration from before. How can the 
hyperbolic radius of the inscribed circle change under a similarity?) 


12. Consider the map
\[
\begin{aligned}
\Phi : \mathbb{C} &\to \mathrm{Mat}(2, \mathbb{R}) \\
x + iy &\mapsto \begin{pmatrix} x & y \\ -y & x \end{pmatrix}.
\end{aligned}
\]

a) Prove that for any $z_1, z_2 \in \mathbb{C}$, $\Phi(z_1 + z_2) = \Phi(z_1) + \Phi(z_2)$ and $\Phi(z_1z_2) =
\Phi(z_1)\Phi(z_2)$. That is, in some sense, every complex number can be faithfully
represented by a matrix with real coefficients.

b) Prove that for any $z \in \mathbb{C}$, $|z|^2 = \det(\Phi(z))$ and $2\Re(z) = \mathrm{tr}(\Phi(z))$.


13. Consider the map
\[
\begin{aligned}
\Phi : \mathbb{H} &\to \mathrm{Mat}(2, \mathbb{C}) \\
x + yi + zj + tk &\mapsto \begin{pmatrix} x + yi & z + ti \\ -z + ti & x - yi \end{pmatrix}.
\end{aligned}
\]

a) Prove that for any $q_1, q_2 \in \mathbb{H}$, $\Phi(q_1 + q_2) = \Phi(q_1) + \Phi(q_2)$ and
$\Phi(q_1q_2) = \Phi(q_1)\Phi(q_2)$. That is, in some sense, every quaternion can be
faithfully represented by a matrix with complex coefficients.

b) Prove that for any $q \in \mathbb{H}$, $|q|^2 = \det(\Phi(q))$ and $\mathrm{tr}(q) = \mathrm{tr}(\Phi(q))$.

c) Prove that $q^{-1} = \bar{q}/|q|^2$ using the result of the previous part.

14. Find an injective map $\Phi : \mathbb{H} \to \mathrm{Mat}(4, \mathbb{R})$ with the property that for all $q_1, q_2 \in
\mathbb{H}$, $\Phi(q_1 + q_2) = \Phi(q_1) + \Phi(q_2)$ and $\Phi(q_1q_2) = \Phi(q_1)\Phi(q_2)$.

15. Prove that any quaternion $q$ is a solution to the equation $X^2 - \mathrm{tr}(q)X + |q|^2 = 0$.
16. Prove that for any $r > 0$, there exist infinitely many quaternions $q$ such that
$q^2 = -r$, and there are only two such that $q^2 = r$.

17. We fill in the gaps in the proof of Theorem 4.13. Let $q$ be any quaternion.

a) Prove that $\bar{q} = q$ if and only if $q \in \mathbb{R}$.
b) Prove that $|\bar{q}| = |q|$.
c) Prove that $q = 0$ if and only if $|q| = 0$.

18. Let $q$ be a quaternion.

a) Prove that if $qi = iq$, $qj = jq$, and $qk = kq$, then $q \in \mathbb{R}$.
b) Prove that the following three conditions are equivalent.

a. $q \in \mathbb{R}$.
b. $qq' = q'q$ for all quaternions $q'$ with no real component.
c. $qq' = q'q$ for all quaternions $q'$.


4.3 PROOFS (Calculus) 


1. Let $\sinh$, $\cosh$, $\tanh$ be as defined in Exercise 4.2.1.

a) Prove that
\[
\begin{aligned}
\lim_{x \to \infty} \sinh(x) &= \infty, & \lim_{x \to -\infty} \sinh(x) &= -\infty, & \lim_{x \to \infty} \cosh(x) &= \infty, \\
\lim_{x \to -\infty} \cosh(x) &= \infty, & \lim_{x \to \infty} \tanh(x) &= 1, & \lim_{x \to -\infty} \tanh(x) &= -1.
\end{aligned}
\]

b) Prove that
\[
\frac{d}{dx}\sinh(x) = \cosh(x), \qquad \frac{d}{dx}\cosh(x) = \sinh(x), \qquad \frac{d}{dx}\tanh(x) = \cosh(x)^{-2}.
\]

c) Prove that cosh(x) > 0 for all real x. (Hint: it is continuous and never zero.) 
d) Prove that sinh and tanh are strictly increasing on R—that is, if x < y, then 
sinh(x) < sinh(y) and tanh(x) < tanh(y). 
e) Prove that tanh(x) > Oif x > 0, tanh(x) < Oif x < 0, and tanh(x) = Oif 
x=0. 
f) Prove that cosh is strictly increasing on [0, oo). 
g) Prove that, considered as functions on [0, oo), sinh and cosh attain a minimum 
at x = 0, and that minimum is 0 and 1, respectively. 
h) Prove that sinh([0, co)) = [0, oo) and cosh([0, oo)) = [1, 00). 
i) Prove that the functions 
sinh : [0, co) — [0, oo) 
cosh : [0, 00) > [1, co) 
tanh : (—oo, 00) > (—1, 1) 
are all bijective. 
j) Since sinh, cosh, tanh defined on the domains above are bijective, they have well-defined inverses. Prove that $\cosh^{-1}(x) = \ln\left(x + \sqrt{x^2 - 1}\right)$.
k) Prove that
\[
\frac{d}{dx}\cosh^{-1}(x) = \frac{1}{\sqrt{x^2 - 1}}.
\]
Conclude that $\cosh^{-1}$ is a strictly increasing function.
l) Prove that the function
\[
p : (-\infty, \infty) \to \mathbb{R}^2, \qquad t \mapsto (\cosh(t), \sinh(t))
\]
is injective and that its image is the right half of the hyperbola $x^2 - y^2 = 1$. This is the origin of the term “hyperbolic function.”


2. Let $p_1, p_2 : \mathbb{R} \to \mathbb{H}^2$ be paths along hyperbolic lines. Prove that if $\lim_{t \to \infty} p_1(t) \neq \lim_{t \to \infty} p_2(t)$, then
\[
\lim_{t \to \infty} d_{\mathrm{hyper}}(p_1(t), p_2(t)) = \infty.
\]
What happens if $p_1$ and $p_2$ approach the same point at infinity? Give a geometric interpretation. (Hint: use an isometry to simplify to some easy to consider case.)
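As a numerical sanity check (not a proof) of parts j) and l) of Exercise 1, the following Python sketch compares the claimed closed form for $\cosh^{-1}$ against `math.cosh` and tests that sample points $(\cosh t, \sinh t)$ land on the right half of the hyperbola. The helper name `arcosh_log` and the sample values are illustrative assumptions, not part of the text.

```python
import math

def arcosh_log(x: float) -> float:
    """Candidate inverse of cosh on [1, oo): ln(x + sqrt(x^2 - 1))."""
    return math.log(x + math.sqrt(x * x - 1.0))

# Part j): the closed form should undo cosh on [0, oo).
for t in [0.0, 0.5, 1.0, 2.0, 5.0]:
    assert math.isclose(arcosh_log(math.cosh(t)), t, abs_tol=1e-9)

# Part l): points (cosh t, sinh t) satisfy x > 0 and x^2 - y^2 = 1.
for t in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    x, y = math.cosh(t), math.sinh(t)
    assert x > 0 and math.isclose(x * x - y * y, 1.0, rel_tol=1e-9)
```

Passing checks on a handful of samples only suggest that the statements are plausible; the exercises still ask for arguments valid for every real input.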


4.4 PROOFS (Linear Algebra) 


1. Define $O(3)$ to be the set of $3 \times 3$ real matrices $M$ with the property that $M^T = M^{-1}$, where $M^T$ is the transpose of $M$. Define $SO(3)$ to be the subset of $O(3)$ consisting of matrices with determinant 1.

a) Let $e_1, e_2, e_3$ be the column vectors of a $3 \times 3$ real matrix $M$. Prove that $M \in O(3)$ if and only if $e_i \cdot e_j = 1$ if $i = j$ and is 0 otherwise.

b) Prove that a $3 \times 3$ real matrix $M$ is in $O(3)$ if and only if $M$ preserves the dot product, in the sense that $v \cdot w = (Mv) \cdot (Mw)$ for all $v, w \in \mathbb{R}^3$. (Hint: write the dot product $v \cdot w = v^T w$.)

c) Prove that O(3) is a group under matrix multiplication. 

d) Prove that $SO(3)$ is a subgroup of $O(3)$.

e) Prove that if $M \in SO(3)$, then $M(v \times w) = (Mv) \times (Mw)$ for all $v, w \in \mathbb{R}^3$.


2. Prove that $\mathbb{H}^1$, the set of quaternions of norm 1, is a group with multiplication as the operation.
3. Prove that
\[
\mathbb{H}^1 \to SU(2), \qquad x + yi + zj + tk \mapsto \begin{pmatrix} x + yi & z + ti \\ -z + ti & x - yi \end{pmatrix}
\]
is a group isomorphism.
4. Define $\mathbb{H}^0$ to be the set of quaternions with no real component, and consider the bijection
\[
\sigma : \mathbb{R}^3 \to \mathbb{H}^0, \qquad (x, y, z) \mapsto xi + yj + zk.
\]
a) Prove that if $q' \in \mathbb{H}^0$ and $q \in \mathbb{H}^1$, then $qq'q^{-1} \in \mathbb{H}^0$.
b) Prove that
\[
A : \mathbb{H}^1 \times \mathbb{R}^3 \to \mathbb{R}^3, \qquad (q, v) \mapsto \sigma^{-1}\left(q\,\sigma(v)\,q^{-1}\right)
\]
is a well-defined action of $\mathbb{H}^1$ on $\mathbb{R}^3$. For simplicity, we shall simply write $q.v$ for $A(q, v)$.

c) Let $q, q' \in \mathbb{H}^1$. Prove that if $q.v = q'.v$ for all $v \in \mathbb{R}^3$, then either $q = q'$ or $q = -q'$. (Hint: you might want to use the result from Exercise 4.2.18.)


5. Consider the map
\[
\mathbb{H}^1 \to \mathrm{Mat}(3, \mathbb{R}), \qquad q \mapsto (q.i, q.j, q.k),
\]
where $i = (1, 0, 0)$, $j = (0, 1, 0)$, and $k = (0, 0, 1)$.

a) Prove that $(q.i, q.j, q.k) \in O(3)$. (Hint: use the fact that $(e_1, e_2, e_3) \in O(3)$ if and only if $e_i \cdot e_j = 0$ if $i \neq j$, and 1 otherwise.)
b) Prove that $(q.i, q.j, q.k) \in SO(3)$. (Hint: use the fact that the determinant of $(e_1, e_2, e_3)$ is $e_1 \cdot (e_2 \times e_3)$.)
c) Prove that if $M = (q.i, q.j, q.k)$ then $q.v = Mv$ for all $v \in \mathbb{R}^3$.
d) Prove that
\[
\mathbb{H}^1 \to SO(3), \qquad q \mapsto (q.i, q.j, q.k)
\]
is a group homomorphism.


6. $SO(3)$ is the group of rotations around the origin in $\mathbb{R}^3$, so the previous exercise shows that quaternions of norm 1 correspond to rotations. Try to prove that all rotations arise in this fashion. (Hint: this is a hard problem. Don’t be ashamed to ask a teacher for help.)
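The action $q.v$ from Exercises 4 and 5 can be experimented with numerically. The sketch below assumes that quaternions are stored as tuples (real, i, j, k) and builds the matrix $(q.i, q.j, q.k)$ for one sample unit quaternion; the function names `qmul`, `act`, `rotation_matrix`, and `det3` are hypothetical helpers. It only illustrates the statements to be proved and does not replace the proofs.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def act(q, v):
    """Compute q.v: conjugate the pure quaternion of v by the unit quaternion q."""
    a, b, c, d = q
    q_inv = (a, -b, -c, -d)  # for |q| = 1 the inverse is the conjugate
    w = qmul(qmul(q, (0.0, *v)), q_inv)
    return w[1:]  # the real part vanishes up to rounding

def rotation_matrix(q):
    """3x3 matrix (as a list of rows) whose columns are q.i, q.j, q.k."""
    cols = [act(q, e) for e in ((1, 0, 0), (0, 1, 0), (0, 0, 1))]
    return [[cols[c][r] for c in range(3)] for r in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Sample unit quaternion cos(t/2) + sin(t/2) k with t = pi/2.
t = math.pi / 2
q = (math.cos(t / 2), 0.0, 0.0, math.sin(t / 2))
M = rotation_matrix(q)
```

For this sample, `M` is (up to rounding) the quarter-turn about the third coordinate axis, and `det3(M)` is close to 1, in line with what part b) of Exercise 5 asks you to prove in general.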


4.5 PROOFS (Metric Geometry) 


1. Let $(M, d)$ be any metric space. Let $X$ be a subset of $M$. Prove that $(X, d)$ is a metric space.

2. Let $X$ be any set. Define a function
\[
d_{\mathrm{discrete}} : X \times X \to \mathbb{R}, \qquad (x, y) \mapsto \begin{cases} 0 & \text{if } x = y \\ 1 & \text{otherwise.} \end{cases}
\]
Prove that $(X, d_{\mathrm{discrete}})$ is a metric space. (It is usually called the discrete metric space.)

3. Let $X$ be some set of symbols; it could be letters of the English alphabet, or $\{0, 1\}$, or something else entirely. We shall call $X$ an alphabet. A string is an element of $X^n$ for some natural number $n$. For some fixed $n$, and any two strings $s, t \in X^n$, define $d_{\mathrm{Hamming}}(s, t)$ to be the number of entries in which $s$ and $t$ differ. Prove that $(X^n, d_{\mathrm{Hamming}})$ is a metric space. (It is usually called the Hamming space in honor of Richard Hamming, who described it in 1950 as part of his work on error-correcting codes, which have been an important topic in computer science ever since.)

4. We prove that for any positive integer $n$, the Euclidean distance
\[
d_{\mathrm{Euclid}} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \qquad (x, y) \mapsto |x - y|,
\]
where we may define $|v| = \sqrt{v^T v}$, turns $\mathbb{R}^n$ into a metric space.

a) Prove that for any $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ and any $j$, $|x| \geq |x_j|$.

b) Prove that $|x| = 0$ if and only if $x = 0$.

c) Given $x, y \in \mathbb{R}^n$, solve for $c \in \mathbb{R}$ such that $|x - cy|^2 = 0$. (You will get some expression in terms of $x^T y$, $|x|$, and $|y|$.)

d) Prove the Cauchy-Schwarz inequality: for all $x, y \in \mathbb{R}^n$, $|x^T y| \leq |x||y|$. (Hint: since $|x - cy|^2 \geq 0$, there can be at most one $c \in \mathbb{R}$ such that $|x - cy|^2 = 0$. Use this and the result of the previous part.)

e) Prove that for all $x, y \in \mathbb{R}^n$, $|x + y| \leq |x| + |y|$. (Hint: expand $|x + y|^2$ and use Cauchy-Schwarz.)

f) Prove that $(\mathbb{R}^n, d_{\mathrm{Euclid}})$ is a metric space.


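5. Define a function
\[
f : \mathbb{H}^2 \times \mathbb{H}^2 \to \mathbb{R}, \qquad (x, y) \mapsto \begin{cases} 0 & \text{if } x = y \\ t & \text{if } \gamma(x) = i,\ \gamma(y) = it \text{ with } t > 1 \text{ for some } \gamma \in \text{Möb}(2) \text{ such that } \gamma(\mathbb{H}^2) = \mathbb{H}^2. \end{cases}
\]
Define a function
\[
d : \mathbb{H}^2 \times \mathbb{H}^2 \to \mathbb{R}, \qquad (x, y) \mapsto \begin{cases} 0 & \text{if } x = y \\ 2 - \dfrac{1}{f(x, y)} & \text{otherwise.} \end{cases}
\]

a) Prove that $(\mathbb{H}^2, d)$ is a metric space.
b) Prove that $\mathrm{Isom}(\mathbb{H}^2, d) = \text{Möb}(2)$.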


6. Let $(M, d)$ be a metric space. For any $f \in \mathrm{Isom}(M, d)$, prove that $f$ is injective.
7. Let $(M, d)$ be a metric space. Consider the set of isometries $\mathrm{Isom}(M)$ with function composition as the operation.

a) Prove that if $f, g \in \mathrm{Isom}(M)$, then $f \circ g \in \mathrm{Isom}(M)$.

b) Prove that $f \circ (g \circ h) = (f \circ g) \circ h$ for any $f, g, h \in \mathrm{Isom}(M)$.

c) Prove that the identity function $\mathrm{id} : M \to M$ satisfies $\mathrm{id} \circ f = f \circ \mathrm{id} = f$ for all $f \in \mathrm{Isom}(M)$. Thus, $(\mathrm{Isom}(M), \circ)$ satisfies all of the properties of a group other than that it might not have inverses. An algebraic structure with these properties is known as a monoid.

8. Let $(M, d)$ be a metric space. Let $\Phi \in \mathrm{Isom}(M)$. Prove that if $\Phi$ is surjective, then $\Phi^{-1} \in \mathrm{Isom}(M)$.
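For the discrete and Hamming exercises in this section, a brute-force experiment on a small alphabet can help build intuition before writing a proof. The following Python sketch assumes the alphabet $X = \{0, 1\}$ and strings of length 4; the function names `d_discrete`, `d_hamming`, and `is_metric_on` are illustrative choices. A finite check of this kind is evidence, not a proof.

```python
from itertools import product

def d_discrete(x, y):
    """Discrete metric: 0 for equal points, 1 otherwise."""
    return 0 if x == y else 1

def d_hamming(s, t):
    """Number of positions in which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("strings must have the same length")
    return sum(1 for a, b in zip(s, t) if a != b)

def is_metric_on(points, d):
    """Brute-force check of the metric axioms on a finite list of points."""
    for x, y in product(points, repeat=2):
        # non-negativity, d(x, y) = 0 exactly when x = y, and symmetry
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y) or d(x, y) != d(y, x):
            return False
    # triangle inequality over all triples
    return all(d(x, z) <= d(x, y) + d(y, z)
               for x, y, z in product(points, repeat=3))

# All 16 binary strings of length 4, i.e. X^4 with X = {0, 1}.
strings = ["".join(bits) for bits in product("01", repeat=4)]
```

Calling `is_metric_on(strings, d_hamming)` or `is_metric_on(strings, d_discrete)` runs through every pair and triple of the 16 strings.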




Properties of Hyperbolic Geometry 


In which parallel lines have an odd 
habit of getting farther and farther 
apart. 


The previous chapter dealt almost exclusively with the problem of constructing 
various models of the hyperbolic plane and hyperbolic 3-space and showing that we 
could get the isometry groups via linear fractional transformations. This chapter will 
reap the rewards of our hard work: we will start by using the techniques we have 
developed to understand the geometry of these spaces; we will end by using these 
spaces to prove results about the groups PSL(2, R) and PSL(2, C). Thus, we will 
see that the group theory and the metric geometry feed each other, enriching both. 


5.1 Peculiarities of Hyperbolic Geometry 


In the introduction to Chapter 4, we said that hyperbolic space was the original 
example of a non-Euclidean space—that is, a geometry that satisfied all of Euclid’s 
axioms save for the last one, the Parallel Postulate. Let’s write down these axioms. 


A straight line may be drawn between any two points.

Any terminated straight line may be extended indefinitely. 

A circle may be drawn with any given point as center and any given radius. 

All right angles are equal. 

If two straight lines in a plane are met by another line, and if the sum of the 
internal angles on one side is less than two right angles, then the straight lines 
will meet if extended sufficiently on the side on which the sum of the angles is 
less than two right angles. 


Fig. 5.1 Hyperbolic counterexamples to Euclid’s Parallel Postulate.


Why does the hyperbolic plane satisfy all of these save for the last one? Let’s go 
through them one by one. 


A straight line... This is certainly true of $\mathbb{H}^2$; indeed, we showed the stronger result that there is a unique line through any two points in $\mathbb{H}^3$. (This is stronger since we know that we can always embed $\mathbb{H}^2$ inside $\mathbb{H}^3$ as a hyperbolic plane.)

Any terminated straight line... If this is true of Euclidean geometry, it certainly has to be true of hyperbolic geometry, given that after applying an isometry, we can assume that the hyperbolic line is just a Euclidean line.

A circle may be drawn... This is certainly true of hyperbolic geometry: we showed exactly how to define hyperbolic circles with arbitrary centers and radii.

All right angles... There is some question here about what exactly is meant by a right angle, but it turns out to be irrelevant since angles in $\mathbb{H}^2$ and $\mathbb{D}^2$ are simply angles in Euclidean geometry.


The last axiom, however, is false in hyperbolic geometry. This is because it is 
possible for two hyperbolic lines to initially move toward each other, but ultimately 
to diverge away! Furthermore, it is possible for two hyperbolic lines to keep getting 
closer and closer but never quite reaching each other. Examples of how this can 
happen are depicted in Figure 5.1. 

It is common in axiomatic geometry to replace the Parallel Postulate with Playfair’s axiom: “Given a line $l$ and a point $p$ not on $l$, there is exactly one line $l'$ that passes through $p$ and does not intersect $l$.” Not surprisingly (since it is equivalent to Euclid’s Parallel Postulate), this statement is also entirely false for hyperbolic geometry, as demonstrated in Figure 5.2. In fact, for any line $l$ and a point $p$ not on $l$, there exist infinitely many lines $l'$ that pass through $p$ and do not intersect $l$. Evidently, hyperbolic lines are shy and do not often want to meet. As an interesting side-note, one historical way of attempting to prove the Parallel Postulate ran as follows: starting with a line $l$ and point $p$ not on the line, consider fixing a ruler with one end on $p$ and the other end orthogonal to $l$. As you then translate this ruler along the line, what is the curve swept out by the other end? Naively, it might seem like it should be exactly the desired parallel line through $p$; certainly, that is exactly what you get in Euclidean space. But in the hyperbolic plane, this curve is not a line at all! (See Exercise 4.2.10.)

Fig. 5.2 Hyperbolic counterexamples to Playfair’s axiom.

On the other hand, since the other axioms of Euclidean geometry do hold for 
hyperbolic space, the ASA, SAS, and SSS triangle congruence theorems all hold for 
it, exactly as they did for spherical geometry. What is significantly more surprising 
is that there is a triangle congruence theorem in hyperbolic geometry that has no 
Euclidean analog: the AAA theorem. That is, any two hyperbolic triangles with 
the same angle measures must be congruent, in the sense that there is an isometry 
moving one to the other. We relegate proofs of these triangle theorems to the exercises. 
(Specifically, Exercises 5.2.8-13.) 

We will, however, prove a related theorem: the sum of the angles of a hyperbolic triangle is always strictly less than $\pi$. In fact, we want to show something stronger: the amount by which it is smaller is exactly the area of the triangle. Now, of course, there is a bit of a snag; we never defined what we mean by hyperbolic area. The rough idea is this: you can set up an integral for the area via a Riemann sum by assuming that the area of a very small rectangle in the hyperbolic plane should be approximately its hyperbolic length times its hyperbolic width. The exact details are worked out in Exercises 5.3.2 and 5.3.3, but the main facts that we are going to use are the following. Below, $A$ will always denote a subset of the hyperbolic plane with defined hyperbolic area $\mu(A)$.

1. For any isometry $\Phi$, $\mu(A) = \mu(\Phi(A))$.

2. If $A_1, A_2, \ldots$ are non-intersecting subsets, then $\mu\left(\bigcup_i A_i\right) = \sum_i \mu(A_i)$.

3. Let $A$ be the interior of an idealized hyperbolic triangle with all vertices on the boundary (that is, its sides are all hyperbolic lines which don’t intersect inside the plane but do intersect on the boundary). Then $\mu(A) = \pi$.


This is enough for us to get the desired result, with the aid of a lemma. 
Fig. 5.3 An illustration of the proof of Lemma 5.1: on the left, we start with an idealized triangle $\triangle ABC$ with one vertex $C$ off the boundary; on the right, we force this triangle into a standard position where the claim is clear.

Lemma 5.1 Let $\triangle ABC$, $\triangle A'B'C'$ be idealized hyperbolic triangles such that $A, A', B, B'$ are on the boundary, but $C, C'$ are not. If $\angle C = \angle C'$, then there exists an isometry $\Phi$ such that $\Phi(A) = A'$, $\Phi(B) = B'$, and $\Phi(C) = C'$.

Proof There exist hyperbolic isometries $\Psi, \Psi'$ such that $\Psi(A) = \Psi'(A') = 1$, $\Psi(C) = \Psi'(C') = 0$, and $\Im(\Psi(B)), \Im(\Psi'(B')) > 0$ in $\mathbb{D}^2$. We can show this as follows: choose any point $p$ on the line segment between $C$ and $A$. We know that there exists an isometry sending $C$ to 0 and $p$ to some $1 > t > 0$; strictly speaking, we proved this for $\mathbb{H}^2$, but $\mathbb{H}^2$ and $\mathbb{D}^2$ are isometrically isomorphic. That isometry, of course, has to send $A$ to 1. Now, either the image of $B$ is above the real line or it is below it (it cannot be on the real line, as then $\triangle ABC$ will not be a triangle). The transformation $z \mapsto \bar{z}$ is an isometry of $\mathbb{D}^2$ that will switch the position if required.

So, to summarize, without loss of generality, we may assume that $A = A' = 1$, $C = C' = 0$, and $\Im(B), \Im(B') > 0$. But now we see that $\triangle ABC$ looks something like a wedge and, in particular, it is obvious that $\angle C$ completely determines where on the boundary $B$ is. (See Figure 5.3.) Therefore, $B = B'$ and we are done.


Theorem 5.1 (Lambert’s theorem) Let $\triangle ABC$ be a hyperbolic triangle. Then the (hyperbolic) area of $\triangle ABC$ is $\pi - (a + b + c)$, where $a, b, c$ are the angle measures of the vertices of $\triangle ABC$.


Proof If we take an idealized hyperbolic polygon with $n$ vertices all on the boundary, it can be obtained by pasting together $n - 2$ such idealized triangles, hence its area must be $(n - 2)\pi$. However, if we take such a polygon and put all of its vertices equidistant along the circle, then we can also split this idealized polygon into $n$ isometric pieces. Each of these pieces is an idealized triangle with two vertices on the boundary and the third of angle measure $2\pi/n$. By Lemma 5.1, all of these triangles are related by isometries, hence they must have the same area: $(n - 2)\pi/n = \pi - 2\pi/n$, to be precise. This proves the theorem for all idealized hyperbolic triangles with $a = 0$, $b = 0$, and $c = 2\pi/n$ for some integer $n > 4$.

Fig. 5.4 An illustration of the first part of the proof of Lambert’s theorem: we decompose an idealized $n$-gon in two different ways. The first allows us to compute its area; the second allows us to compute the area of idealized triangles with one off-boundary angle of measure $2\pi/n$. The last image shows that we can subtract off idealized triangles to compute the area of idealized triangles with one off-boundary angle of measure $2\pi m/n$.

We can then bootstrap to get the case $a = 0$, $b = 0$, and $c = 2\pi m/n$ for some integer $1 < m < n$. We do this as follows: first paste together $m$ triangles with $a = 0$, $b = 0$, and $c = 2\pi/n$ at the vertex not on the boundary. This gives an idealized polygon with one vertex of angle measure $2\pi m/n$ and $m + 1$ vertices on the boundary. This can be fixed easily enough by subtracting off $m - 1$ triangles with all their vertices on the boundary; how this is done is illustrated in Figure 5.4. This gives us the desired triangle with $a = 0$, $b = 0$, and $c = 2\pi m/n$, and its area must be $m(\pi - 2\pi/n) - (m - 1)\pi = \pi - 2\pi m/n$, as expected. This allows us to approximate any idealized hyperbolic triangle with two vertices on the boundary arbitrarily closely and so we must conclude that, in general, the area of such a triangle is $\pi - c$ if $c$ is the angle measure of the non-idealized vertex.

Next, consider an idealized hyperbolic triangle with just one vertex on the boundary. Use an isometry to send this vertex to $\infty$ in the Poincaré half-plane, and then use isometries $z \mapsto z + x_0$ and $z \mapsto \lambda^2 z$ to move one of the other vertices to $i$. Call the angle measure of that vertex $b$, and the angle measure of the other $c$. It is easy to see that if you glue to this triangle another idealized triangle with two vertices on the boundary and the last vertex of angle measure $\pi - c$, then together they form an idealized triangle with two vertices on the boundary and the last vertex of angle measure $b$. Consequently, the area of our desired triangle is $(\pi - b) - (\pi - (\pi - c)) = \pi - b - c$.

Finally, consider a non-idealized triangle $\triangle ABC$ with angle measures $a, b, c$. Use an isometry to move $A$ to $i$ in the Poincaré plane, $B$ to some point $it$ with $t > 1$, and $C$ to some point $x + iy$ with $x, y > 0$. Draw a vertical line $l$ through $C$, and let $\alpha$ be the angle measure between this line and $BC$. Then one can paste two idealized hyperbolic triangles to $\triangle ABC$: one along $BC$, with angle measures $\pi - b$, $\alpha$, and 0, and the other along $l$, with angle measures $\pi - \alpha - c$, 0, and 0. Choosing these idealized hyperbolic triangles so that their sides lie on the hyperbolic line through $AC$, $x = 0$, and $l$, we get a new idealized hyperbolic triangle with two vertices on the boundary and the remaining one with an angle measure of $a$; this is shown in Figure 5.5. Therefore, the area of $\triangle ABC$ is

\[
\pi - a - (\pi - (\pi - b + \alpha)) - (\pi - (\pi - \alpha - c)) = \pi - a - b - c,
\]

as we initially claimed.

Fig. 5.5 An illustration of the second part of the proof of Lambert’s theorem. The first image shows how to subtract off idealized triangles with one off-boundary angle to compute the area of idealized triangles with one vertex on the boundary. The second image shows how to carve up idealized triangles to compute the area of a non-idealized triangle.


One surprising consequence of Lambert’s theorem is that no hyperbolic triangle can have an area more than $\pi$! This would not be strange for a space whose total area was finite, like a sphere, but the total area of the hyperbolic plane is infinite.
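Lambert’s theorem turns area computations into arithmetic on angles, and the bookkeeping in the first step of its proof can be replayed numerically. The Python sketch below is only an illustration; the function names `triangle_area` and `ideal_polygon_area` are hypothetical, and an angle of 0 is used to mark a vertex on the boundary.

```python
import math

def triangle_area(a: float, b: float, c: float) -> float:
    """Hyperbolic area pi - (a + b + c); an angle of 0 marks an ideal vertex."""
    if min(a, b, c) < 0 or a + b + c >= math.pi:
        raise ValueError("angles must be non-negative with sum less than pi")
    return math.pi - (a + b + c)

def ideal_polygon_area(n: int) -> float:
    """Area of an ideal n-gon, glued from n - 2 ideal triangles."""
    return (n - 2) * triangle_area(0.0, 0.0, 0.0)

# The two decompositions from the proof agree: n wedges with central
# angle 2*pi/n have the same total area as n - 2 ideal triangles.
for n in range(3, 10):
    wedges = n * triangle_area(0.0, 0.0, 2 * math.pi / n)
    assert math.isclose(wedges, ideal_polygon_area(n))
```

Since the angle sum is never negative, `triangle_area` can never return more than $\pi$, which is the bound on triangle areas noted above.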


5.2 Decomposing via the Trace 


Now that some of the oddness of hyperbolic geometry has been illuminated, we shift 
our focus. Our goal for this section is straightforward: we aim to find a reasonable 
way of dividing up the orientation-preserving isometries of the hyperbolic plane and 
hyperbolic space into some number of easily understood families. Our main tool 
to do this is the trace of a matrix. Given any matrix in $\mathrm{SL}(2, \mathbb{R})$ or $\mathrm{SL}(2, \mathbb{C})$, we know how to define the trace; can this be done for $\mathrm{PSL}(2, \mathbb{R})$ and $\mathrm{PSL}(2, \mathbb{C})$? On the face of it, this is not possible: after all, for any $\phi \in \mathrm{PSL}(2, \mathbb{R})$, we know that there exist two different matrices $\pm M$ that correspond to $\phi$. However, we might notice that switching between the two matrices only changes the trace by a factor of $\pm 1$; therefore, the square of the trace is perfectly well-defined.

Definition 5.1 Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$. For any $\phi \in \mathrm{PSL}(2, F)$, we define $\mathrm{tr}^2(\phi) = \mathrm{tr}(M)^2$ where $M \in \mathrm{SL}(2, F)$ is a matrix that maps to $\phi$ under the standard group homomorphism $\mathrm{SL}(2, F) \to \mathrm{PSL}(2, F)$.
Fig. 5.6 (a), (b), (c) show the effect of a hyperbolic isometry; (d), (e), (f) show the effect of a hyperbolic isometry conjugate to the first, but which is easier to understand.


One of the key properties of the trace is that it is invariant under conjugation. 


Definition 5.2 We say that $\phi, \psi \in \mathrm{PSL}(2, \mathbb{C})$ are conjugate if there exists $\gamma \in \mathrm{PSL}(2, \mathbb{C})$ such that $\phi = \gamma \circ \psi \circ \gamma^{-1}$.

Two conjugate isometries are shown in Figure 5.6. Since we know that $\mathrm{tr}(M) = \mathrm{tr}(UMU^{-1})$ for any $U \in \mathrm{GL}(2, \mathbb{C})$ and any $2 \times 2$ complex matrix $M$, it is immediate that if $\phi, \psi$ are conjugate, then $\mathrm{tr}^2(\phi) = \mathrm{tr}^2(\psi)$. Why is conjugation so important? We have seen it crop up countless times in the previous chapters whenever we had to do a change of coordinates: if $\psi$ was a transformation of a space, then $\gamma \circ \psi \circ \gamma^{-1}$ was the right way to express $\psi$ in this new setting. So, the square of the trace is some property that persists even under such changes.
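Both invariance properties of the squared trace (under the sign change $M \mapsto -M$ and under conjugation) are easy to test on a concrete example. The following Python sketch uses plain nested tuples for $2 \times 2$ matrices; the helper names and the particular matrices `M` and `U` are arbitrary illustrative choices.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def trace_squared(M):
    """Square of the trace of a 2x2 matrix."""
    return (M[0][0] + M[1][1]) ** 2

M = ((2, 1), (1, 1))        # determinant 1, so M lies in SL(2, R)
U = ((1 + 2j, 3), (1j, 1))  # an arbitrary invertible complex matrix
neg_M = tuple(tuple(-x for x in row) for row in M)
conj = matmul(matmul(U, M), inverse(U))

assert trace_squared(M) == trace_squared(neg_M) == 9
assert abs(trace_squared(conj) - 9) < 1e-9
```

The second assertion uses a tolerance because the conjugated matrix is computed in floating-point complex arithmetic.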

Let’s start by using the trace to classify elements in $\mathrm{PSL}(2, \mathbb{R})$. Note that since the trace of any matrix in $\mathrm{SL}(2, \mathbb{R})$ is real, $\mathrm{tr}^2(\phi) \geq 0$ for all $\phi \in \mathrm{PSL}(2, \mathbb{R})$.


Definition 5.3 We say that $\phi \in \mathrm{PSL}(2, \mathbb{C})$ is

1. elliptic if $0 \leq \mathrm{tr}^2(\phi) < 4$,
2. hyperbolic if $\mathrm{tr}^2(\phi) > 4$, and
3. parabolic if $\mathrm{tr}^2(\phi) = 4$.
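Definition 5.3 translates directly into a small decision procedure for real matrices. The sketch below is one possible rendering in Python; the function name `classify_real` and the floating-point tolerance `1e-12` are assumptions made for illustration, since the definition itself is exact.

```python
def classify_real(a: float, b: float, c: float, d: float) -> str:
    """Classify the class of [[a, b], [c, d]] in PSL(2, R) by its squared trace."""
    if abs(a * d - b * c - 1.0) > 1e-12:
        raise ValueError("matrix must have determinant 1")
    t2 = (a + d) ** 2  # unchanged if the matrix is replaced by its negative
    if abs(t2 - 4.0) < 1e-12:
        # By the definition alone, the identity also lands here.
        return "parabolic"
    return "elliptic" if t2 < 4.0 else "hyperbolic"

assert classify_real(0, -1, 1, 0) == "elliptic"      # z -> -1/z
assert classify_real(2, 0, 0, 0.5) == "hyperbolic"   # z -> 4z
assert classify_real(1, 1, 0, 1) == "parabolic"      # z -> z + 1
```

The three sample matrices correspond to a rotation-like map, a dilation, and a translation of the upper half-plane, respectively.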


Obviously, any element in PSL (2, R) is one of these three, purely by definition. 
However, this split into three distinct categories is not arbitrary. We first need a lemma. 


Lemma 5.2 Let $\phi \in \mathrm{PSL}(2, \mathbb{C})$. There exists $z \in \mathbb{CP}^1$ such that $\phi(z) = z$.

Remark 5.1 This lemma can be generalized a substantial amount using tools from topology. It is true, for example, that any map $\mathbb{CP}^1 \to \mathbb{CP}^1$ that is differentiable, has a differentiable inverse, and is orientation-preserving, must have at least one fixed point; this is a consequence of the celebrated Brouwer fixed-point theorem. We draw an illustration of such a map in Figure 5.7 (we use the fact that stereographic projection lets us use the sphere and $\mathbb{CP}^1$ interchangeably), but we do not pursue anything so complicated.

Fig. 5.7 An illustration of a map from the sphere back to itself which is differentiable, has a differentiable inverse, and is orientation-preserving; it is not, however, angle-preserving like any element in $\mathrm{PSL}(2, \mathbb{C})$ would be. This map has two fixed points, drawn in red.


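Proof Write $\phi(z) = M.z$ for some

\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{C}).
\]

If $c = 0$, the claim is obvious: $\phi(\infty) = \infty$. Otherwise, we see that we are trying to solve

\[
\frac{az + b}{cz + d} = z,
\]

which is really just a quadratic equation $cz^2 + (d - a)z - b = 0$. Any quadratic equation always has at least one solution in $\mathbb{C}$.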
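The proof is constructive, so the fixed points can actually be computed. The following Python sketch solves the quadratic $cz^2 + (d - a)z - b = 0$ with `cmath.sqrt`; the function name `fixed_points` and the convention of representing the point at infinity by the string `"inf"` are assumptions made for this illustration.

```python
import cmath

def fixed_points(a, b, c, d):
    """Fixed points in CP^1 of z -> (az + b)/(cz + d), assuming ad - bc != 0."""
    if c == 0:
        # The map is z -> (a/d) z + b/d, which always fixes infinity.
        if a == d:
            if b == 0:
                raise ValueError("the identity fixes every point")
            return ["inf"]
        return ["inf", b / (d - a)]
    # Roots of c z^2 + (d - a) z - b = 0; a set merges a repeated root.
    disc = cmath.sqrt((d - a) ** 2 + 4 * b * c)
    roots = {(a - d + disc) / (2 * c), (a - d - disc) / (2 * c)}
    return sorted(roots, key=lambda z: (z.real, z.imag))

assert fixed_points(0, -1, 1, 0) == [-1j, 1j]   # z -> -1/z fixes -i and i
assert fixed_points(1, 1, 0, 1) == ["inf"]      # z -> z + 1 fixes only infinity
```

Because the roots are found in floating-point arithmetic, a repeated root may in general show up as two nearly equal values rather than one.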


Theorem 5.2 (Classification of the Orientation-Preserving Isometries of the Hyperbolic Plane)

For any non-identity $\phi \in \mathrm{PSL}(2, \mathbb{R})$, exactly one of the following is true.

1. $\phi$ is elliptic, it fixes exactly one point in the hyperbolic plane (and none on the boundary), and it is conjugate to some transformation $z \mapsto e^{i\theta}z$.

2. $\phi$ is hyperbolic, it fixes exactly two points on the boundary of the hyperbolic plane (and none in the plane itself), and it is conjugate to some transformation $z \mapsto \lambda^2 z$.

3. $\phi$ is parabolic, it fixes exactly one point on the boundary of the hyperbolic plane (and none in the plane itself), and it is conjugate to some transformation $z \mapsto z + x$.
Proof By the lemma, we know that $\phi$ must fix at least one point in the hyperbolic plane plus its boundary. We also know that it can fix at most two points; if it fixes three, then it is the identity, due to the properties of linear fractional transformations. Suppose that $\phi$ fixes a point $z_0$ in the Poincaré plane. Extending $\phi$ to act on all of $\mathbb{CP}^1$, it is easy to see that $\phi$ also fixes $\bar{z}_0$; this is because

\[
\bar{z}_0 = \overline{M.z_0} = \overline{M}.\bar{z}_0 = M.\bar{z}_0,
\]

where $M \in \mathrm{SL}(2, \mathbb{R})$ is a matrix such that $\phi(z) = M.z$. But this means that $\phi$ cannot possibly fix any more points in either the hyperbolic plane or its boundary. Therefore, we see that we really do have exactly three possibilities: $\phi$ fixes one point in the plane; $\phi$ fixes two points on the boundary; $\phi$ fixes one point on the boundary. It remains to show that these three cases match with everything else attributed to them.

Suppose that $\phi$ fixes one point $z_0$ in the hyperbolic plane. We can find an isometry $\gamma \in \mathrm{PSL}(2, \mathbb{C})$ which sends the hyperbolic plane as a whole to $\mathbb{D}^2$ and sends $z_0$ to 0 in particular. If we define $\tilde{\phi} = \gamma \circ \phi \circ \gamma^{-1}$, then $\tilde{\phi}(0) = 0$. We have already worked out what all such transformations look like in the exercise at the end of Section 4.6: it must be that $\tilde{\phi}(z) = e^{i\theta}z$ for some $0 < \theta < 2\pi$. Of course, $\phi$ is conjugate to $\tilde{\phi}$ by definition, so we just need to show that $\phi$ is elliptic. Indeed, $\tilde{\phi}$ corresponds to the matrix

\[
\begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix},
\]

so $\mathrm{tr}^2(\phi) = \mathrm{tr}^2(\tilde{\phi}) = 4\cos(\theta/2)^2$. Since $0 < \theta < 2\pi$, we are done.

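Suppose that $\phi$ fixes two points $z_0, z_1$ on the boundary. We can find an isometry $\gamma \in \mathrm{PSL}(2, \mathbb{C})$ which sends the hyperbolic plane as a whole to $\mathbb{H}^2$ and sends $z_0 \mapsto 0$, $z_1 \mapsto \infty$ in particular. If we define $\tilde{\phi} = \gamma \circ \phi \circ \gamma^{-1}$, then $\tilde{\phi}(0) = 0$ and $\tilde{\phi}(\infty) = \infty$. Since $\tilde{\phi}(\infty) = \infty$, $\tilde{\phi}(z) = az + b$; since $\tilde{\phi}(0) = 0$, $b = 0$. Such a transformation preserves $\mathbb{H}^2$ if and only if $a > 0$, hence $\tilde{\phi}(z) = \lambda^2 z$ for some $\lambda \in \mathbb{R} \setminus \{0, \pm 1\}$, and can be represented by a matrix

\[
\begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \in \mathrm{SL}(2, \mathbb{R}),
\]

whence $\mathrm{tr}^2(\phi) = \mathrm{tr}^2(\tilde{\phi}) = (\lambda + \lambda^{-1})^2$. The only thing missing is proving that this must necessarily be larger than 4, which is a simple calculus argument. (See Exercise 5.3.1.) Therefore, $\phi$ is hyperbolic.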

If $\phi$ fixes one point $z_0$ on the boundary, then by the same argument as above, it must be conjugate to $\tilde{\phi}(z) = az + b$, with $a, b \in \mathbb{R}$. If $a \neq 1$, then this transformation will also fix $b/(1 - a)$, which contradicts the fact that it only fixes one point. Therefore, $\tilde{\phi}$ is represented by some matrix

\[
\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}
\]

and $\mathrm{tr}^2(\phi) = \mathrm{tr}^2(\tilde{\phi}) = 4$, so $\phi$ is parabolic.
This is already thought-provoking. We might analyze each of these three basic types of elements in $\mathrm{PSL}(2, \mathbb{R})$ and try to find their Euclidean analogs; for instance, it is reasonably clear that the elliptic elements are nothing more than rotations in hyperbolic space. Indeed, we will do precisely this in the next sections of this chapter. But, first, it is a good idea to see if there is a similar decomposition for $\mathrm{PSL}(2, \mathbb{C})$.

There is, and it is very similar to the one for $\mathrm{PSL}(2, \mathbb{R})$; we just need to add one other kind of transformation.


Definition 5.4 We say that $\phi \in \mathrm{PSL}(2, \mathbb{C})$ is loxodromic if $\mathrm{tr}^2(\phi) \notin [0, \infty)$.


Remark 5.2 Some authors instead define loxodromic transformations as those where 
the square trace is not in [0, 4], which includes the hyperbolic elements. 


Theorem 5.3 (Classification of the Orientation-Preserving Isometries of Hyperbolic Space)

For any non-identity $\phi \in \mathrm{PSL}(2, \mathbb{C})$, exactly one of the following is true.

1. $\phi$ is elliptic, it fixes a hyperbolic line in $\mathbb{H}^3$ (including the endpoints on the boundary), and it is conjugate to some transformation $z \mapsto e^{i\theta}z$ with $0 < \theta < 2\pi$.

2. $\phi$ is hyperbolic, it fixes exactly two points on the boundary of hyperbolic space (and none in $\mathbb{H}^3$), and it is conjugate to some transformation $z \mapsto \lambda^2 z$ with $\lambda > 0$ and $\lambda \neq 1$.

3. $\phi$ is parabolic, it fixes exactly one point on the boundary of hyperbolic space (and none in $\mathbb{H}^3$), and it is conjugate to some transformation $z \mapsto z + b$ for some $b \in \mathbb{C}$.

4. $\phi$ is loxodromic, it fixes exactly two points on the boundary of hyperbolic space (and none in $\mathbb{H}^3$), and it is conjugate to some transformation $z \mapsto \lambda^2 e^{i\theta}z$ for some $1 \neq \lambda > 0$ and $0 < \theta < 2\pi$.


Proof We know that $\phi$ has to fix at least one point in $\mathbb{CP}^1$; without loss of generality (since we only care about $\phi$ up to conjugation), we may assume that point is $\infty$. Thus, $\phi(z) = az + b$. If $a = 1$, then $\phi(z) = z + b$ is obviously parabolic and only fixes the point $\infty$ in both $\mathbb{H}^3$ and $\mathbb{CP}^1$.

Therefore, we may assume that $a \neq 1$. In that case, $\phi$ has to have a second fixed point in $\mathbb{CP}^1$, namely $b/(1 - a)$. Again, since we only care about $\phi$ up to conjugation, we may assume that this fixed point is 0, so indeed $\phi(z) = az$ for some $a \neq 1$. Exactly one of the following must be true.

1. $a = e^{i\theta}$ for some $0 < \theta < 2\pi$.
2. $a = \lambda^2$ for some $\lambda > 0$ and $\lambda \neq 1$.
3. $a = \lambda^2 e^{i\theta}$ for some $1 \neq \lambda > 0$ and $0 < \theta < 2\pi$.


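In the first case, $\phi$ is elliptic: it corresponds to the matrix

\[
\begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix} \in \mathrm{SL}(2, \mathbb{C}).
\]

To find the fixed points in $\mathbb{H}^3$, we just choose any $p = z + tj$ where $z \in \mathbb{C}$ and $t > 0$, and compute

\[
\begin{pmatrix} e^{i\theta/2} & 0 \\ 0 & e^{-i\theta/2} \end{pmatrix}.(z + tj) = e^{i\theta/2}(z + tj)e^{i\theta/2} = e^{i\theta}z + tj,
\]

which is equal to $p$ if and only if $z = 0$; therefore, $\phi$ fixes the hyperbolic line $z = 0$.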


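In the second case, $\phi$ is hyperbolic: it corresponds to the matrix

\[
\begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \in \mathrm{SL}(2, \mathbb{R}).
\]

It’s clear that $\phi$ cannot have any fixed points in $\mathbb{H}^3$ since it will send a point with $j$-coordinate $t$ to a point with $j$-coordinate $\lambda^2 t$.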
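In the final case, we see that $\phi$ corresponds to a matrix

\[
\begin{pmatrix} \lambda e^{i\theta/2} & 0 \\ 0 & \lambda^{-1}e^{-i\theta/2} \end{pmatrix} \in \mathrm{SL}(2, \mathbb{C}).
\]

If we write $w = \lambda e^{i\theta/2}$, then $\mathrm{tr}^2(\phi) = (w + 1/w)^2$. Let’s suppose that $\mathrm{tr}^2(\phi) = r \in [0, \infty)$. Then $s = w + 1/w$ is a real number with $s^2 = r$, which implies that $w^2 - sw + 1 = 0$, so

\[
w = \frac{s \pm \sqrt{s^2 - 4}}{2}.
\]

If $r \geq 4$, this is a real number, which $w$ is not by assumption (since $0 < \theta/2 < \pi$). On the other hand, if $0 \leq r < 4$, then $\sqrt{s^2 - 4}$ is pure imaginary, which means that

\[
|w|^2 = \frac{s + \sqrt{s^2 - 4}}{2} \cdot \frac{s - \sqrt{s^2 - 4}}{2} = \frac{s^2 - (s^2 - 4)}{4} = 1,
\]

which contradicts the fact that $w$ is not on the unit circle (since $\lambda \neq 1$). Ergo, we conclude that $\mathrm{tr}^2(\phi) \notin [0, \infty)$, hence $\phi$ is loxodromic. That it doesn’t fix any points in $\mathbb{H}^3$ can be shown as in the hyperbolic case: it will map any point with $j$-coordinate $t$ to a point with $j$-coordinate $\lambda^2 t$.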
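The case analysis in this proof can be spot-checked numerically by computing the squared trace of the normal form $\mathrm{diag}(w, 1/w)$ with $w = \lambda e^{i\theta/2}$ and seeing where it falls. The Python sketch below is illustrative only: the function names and the tolerance `tol` are assumptions, and a floating-point test cannot replace the exact argument.

```python
import cmath
import math

def classify_complex(tr2: complex, tol: float = 1e-9) -> str:
    """Classify a non-identity element of PSL(2, C) from its squared trace."""
    if abs(tr2.imag) > tol or tr2.real < -tol:
        return "loxodromic"  # tr^2 is not in [0, oo)
    if abs(tr2.real - 4.0) <= tol:
        return "parabolic"
    return "elliptic" if tr2.real < 4.0 else "hyperbolic"

def tr2_of_normal_form(lam: float, theta: float) -> complex:
    """Squared trace of diag(w, 1/w) with w = lam * e^{i theta / 2}."""
    w = lam * cmath.exp(1j * theta / 2)
    return (w + 1 / w) ** 2

assert classify_complex(tr2_of_normal_form(1.0, math.pi / 3)) == "elliptic"
assert classify_complex(tr2_of_normal_form(2.0, 0.0)) == "hyperbolic"
assert classify_complex(tr2_of_normal_form(2.0, math.pi / 2)) == "loxodromic"
```

The three samples correspond to $a = e^{i\pi/3}$, $a = 4$, and $a = 4e^{i\pi/2}$ in the notation of the proof.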


5.3 Elliptic Elements 


We shall now go through each of the four types of elements in $\mathrm{PSL}(2, \mathbb{C})$ in turn, starting with the simplest: the elliptic elements. We know by Theorems 5.2 and 5.3 that these are conjugate to transforms $z \mapsto e^{i\theta}z$, so we might suppose that the most natural way to think about them is as rotations of hyperbolic space. This is indeed the usual perspective, and many properties are shared with the more familiar Euclidean setting.
For example, suppose that we choose some elliptic $g \in \mathrm{PSL}(2, \mathbb{R})$. It has a unique fixed point $w \in \mathbb{H}^2$, so consider what happens to a hyperbolic circle $C$ with center $w$ and radius $r$. Since $g$ is an isometry, $g(C)$ must also be a hyperbolic circle with radius $r$. In fact, since $g(w) = w$, we see that actually $g(C) = C$. Therefore, we have an infinite family of concentric circles $C$ such that $g$ moves each of these circles back to itself. Some examples are shown in Figure 5.8. Moreover, we can describe precisely what $g$ does to the points on each of these circles (Figure 5.9).


Theorem 5.4 Let $g \in \mathrm{PSL}(2, \mathbb{R})$ be elliptic, with unique fixed point $w \in \mathbb{H}^2$. Let $C$ be a hyperbolic circle centered at $w$ with radius $R$. There exists a constant $k \in (0, 2R]$ such that one of the following is true.

1. For each point $z \in C$, $g(z)$ is the unique point that is counterclockwise from $z$ on $C$ and hyperbolic distance $k$ from $z$.

2. For each point $z \in C$, $g(z)$ is the unique point that is clockwise from $z$ on $C$ and hyperbolic distance $k$ from $z$.


Proof Note that everything here is expressed in ways that are invariant under isometries, so it is actually sufficient to prove what we want for $g(z) = e^{i\theta}z$ acting on $\mathbb{D}^2$, where $C$ is then just a circle centered at 0 with some Euclidean radius $0 < r < 1$. Since $g$ is now just a Euclidean rotation, it is going to move all points on $C$ either clockwise or counterclockwise. The farthest it could possibly move them is $2R$, which is what it would be for antipodal points on the circle. It remains to prove that it moves all points by the same amount, as measured in hyperbolic space. An arbitrary element of $C$ can be written as $re^{i\alpha}$ and so

\[
d_{\mathrm{hyper}}(z, g(z)) = d_{\mathrm{hyper}}\left(re^{i\alpha}, re^{i(\alpha + \theta)}\right) = d_{\mathrm{hyper}}\left(r, re^{i\theta}\right),
\]

which we note depends only on $g$ and $C$, but not $z$.
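The independence of the step length from the starting point can also be observed numerically. The Python sketch below assumes the common curvature $-1$ normalization of the Poincaré disk distance, $d(z, w) = 2\,\mathrm{artanh}\left(|z - w| / |1 - \bar{z}w|\right)$; the normalization of $d_{\mathrm{hyper}}$ used earlier in this book may differ by a constant factor, which would not affect the conclusion. The function names are hypothetical.

```python
import cmath
import math

def d_disk(z: complex, w: complex) -> float:
    """Hyperbolic distance in the unit disk (curvature -1 normalization)."""
    return 2.0 * math.atanh(abs(z - w) / abs(1 - z.conjugate() * w))

def step_lengths(r: float, theta: float, samples: int = 12):
    """Distances d(z, e^{i theta} z) for equally spaced z on the circle |z| = r."""
    out = []
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        out.append(d_disk(z, cmath.exp(1j * theta) * z))
    return out

# Every point of the circle |z| = 0.5 is moved by the same hyperbolic distance.
lengths = step_lengths(0.5, math.pi / 3)
assert max(lengths) - min(lengths) < 1e-12

# A half-turn moves points by the hyperbolic diameter 2R of the circle.
R = d_disk(0j, 0.5 + 0j)
assert abs(step_lengths(0.5, math.pi)[0] - 2 * R) < 1e-12
```

The second check matches the remark in the proof that the largest possible displacement, $2R$, occurs for antipodal points.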


The situation for an arbitrary elliptic element $g \in \mathrm{PSL}(2, \mathbb{C})$ is not very different, particularly as we know that any such element must be conjugate to an element in $\mathrm{PSL}(2, \mathbb{R})$, as they are both conjugate to elements $z \mapsto e^{i\theta}z$. We may nevertheless want to understand what an elliptic element does to the whole of $\mathbb{CP}^1$ or even $\mathbb{H}^3$, rather than merely $\mathbb{H}^2$ or $\mathbb{D}^2$. In terms of the action on $\mathbb{CP}^1$, we can reason thus: $z \mapsto e^{i\theta}z$ fixes 0 and $\infty$ and all other elements are rotated around in circular orbits. Since $\mathrm{PSL}(2, \mathbb{C})$ preserves generalized circles, we see that the corresponding general statement is this: if $\phi$ is elliptic, then it has some fixed points $z_1, z_2 \in \mathbb{CP}^1$ and the rest of $\mathbb{CP}^1$ is a union of an infinite family of generalized circles that are each sent back to themselves by $\phi$. On any particular circle, $\phi$ moves points in an orbit counterclockwise (or clockwise, depending on how you look at it). This is illustrated in Figure 5.10.

As for points in H*, we know that for any elliptic element ¢, there exists a hyper- 
bolic line in H? that is fixed by ¢. In some sense, every other point is rotated around 
this line. We can see this in one of two ways. First, we can note that ¢ is conjugate to 




Fig. 5.8 A rotation by an angle of $\pi/3$ seen in three different coordinate frames: in the first row, it
is seen in the Poincaré disk, with 0 as the fixed point; in the second row, it is still in the Poincaré
disk, but with a different fixed point; in the last row, it is seen in the Poincaré half-plane, with $i$
as the fixed point. In each illustration, the dashed circles are all concentric and are moved back to
themselves by the rotation.


some transformation $z \mapsto e^{i\theta}z$. We previously saw that such a transformation sends $z + tj$ to
$e^{i\theta}z + tj$ for any $z \in \mathbb{C}$ and $t > 0$, so this is a rotation by an angle $\theta$ around the $j$-axis.
Conjugation will replace the $j$-axis with some other hyperbolic line, but the basic principle that
$\phi$ moves points in circular orbits around the line will remain true.

There is a different way to get the same result, which is summarized in the fol- 
lowing theorem. 
Fig. 5.9 An illustration of the dynamics in $\mathbb{CP}^1$ of an elliptic element conjugate to $z \mapsto e^{2\pi i/9}z$.
The fixed points are drawn in purple; all other points are moved along circular paths, drawn in blue.
The arrows show precisely where each point is moved.


Theorem 5.5 Let $\phi \in PSL(2, \mathbb{C})$ be an elliptic element. Let $l$ be the hyperbolic line
in $\mathbb{H}^3$ that is fixed by $\phi$.

1. For any $p \in l$, there exists a unique hyperbolic plane $P_p$ that is orthogonal to $l$
at $p$.

2. $\mathbb{H}^3$ is the disjoint union of the planes $P_p$. (That is, $P_p \cap P_{p'} = \emptyset$ if $p \neq p'$, but
the union of all of them is $\mathbb{H}^3$.)

3. $\phi(P_p) = P_p$ for all $p \in l$. Restricted to $P_p$, $\phi$ is elliptic, with unique fixed point $p$.


Proof I leave the proofs of the first two assertions to the reader (see Exercise 5.2.15).
For the third part, choose any such plane $P_p$: $\phi(P_p)$ must also be a hyperbolic plane
that passes through $\phi(p) = p$ and is orthogonal to the fixed hyperbolic line, which is
just to say that $\phi(P_p) = P_p$. Since $P_p$ is just a copy of $\mathbb{H}^2$ inside of $\mathbb{H}^3$, we see that
we can think of $\phi$ as an isometry of $P_p$. Specifically, it is an isometry with one fixed
point, $p$, so it is elliptic.


Note that each of the planes $P_p$ is isometrically isomorphic to $\mathbb{H}^2$, so Theorem
5.4 applies. In particular, it means that $\phi$ moves points inside each of these planes
in circular orbits, as in Figure 5.10.
Fig. 5.10 An illustration of the dynamics on $\mathbb{H}^3$ of an elliptic element: a hyperbolic line is shown
which is fixed by this element, together with planes orthogonal to said line. Inside each plane, points
are moved in circular orbits, drawn in blue.

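
> Example Show that $\phi(z) = -z^{-1}$ is elliptic. Find its fixed points in $\mathbb{CP}^1$ and a
$\theta \in \mathbb{R}$ such that it is conjugate to $z \mapsto e^{i\theta}z$.
The matrix that corresponds to $\phi$ is

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in SL(2, \mathbb{C}),$$

which has trace 0; therefore, $\phi$ is elliptic, possessing two fixed points in $\mathbb{CP}^1$. By
inspection, $\pm i$ are fixed, so they are the unique fixed points. To get this transformation
to be conjugate to $z \mapsto e^{i\theta}z$, we should move these two points to 0 and $\infty$. Thankfully,
we already know, due to the isometric isomorphism between $\mathbb{H}^2$ and $\mathbb{D}^2$, that the map

$$z \mapsto \frac{z - i}{-iz + 1}$$

does exactly that. So, we simply compute

$$\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \\ \frac{-i}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \\ \frac{-i}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}^{-1} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$$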
Fig. 5.11 An illustration of a continuous transformation $F : \mathbb{D}^2 \to \mathbb{D}^2$ such that 0 is an attractive
fixed point. From left to right, we have the original image, its image under $F$, and its image under
$F^2$.


to conclude that $\phi$ is conjugate to $z \mapsto -z = e^{i\pi}z$. Thus, we see it for what it is: it
is a half-turn around $i$ in the upper half-plane.
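
The computation in this example can be reproduced numerically. The sketch below assumes the normalization $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}$ for the map sending $i$ to 0 and $-i$ to $\infty$ (any other normalization in $SL(2, \mathbb{C})$ would differ at most by a sign); the helper names `matmul`, `inverse`, and `mobius` are illustrative.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def inverse(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((A[1][1] / det, -A[0][1] / det), (-A[1][0] / det, A[0][0] / det))


def mobius(A, z):
    """Apply the linear fractional transformation A.z = (az + b)/(cz + d)."""
    return (A[0][0] * z + A[0][1]) / (A[1][0] * z + A[1][1])


phi = ((0, 1), (-1, 0))  # z -> -1/z
s = 1 / math.sqrt(2)
cayley = ((s, -1j * s), (-1j * s, s))  # assumed normalization; sends i to 0

conjugated = matmul(matmul(cayley, phi), inverse(cayley))
# For a diagonal matrix diag(u, v), the transformation is z -> (u/v) z.
multiplier = conjugated[0][0] / conjugated[1][1]
print(conjugated, multiplier)
```

The off-diagonal entries vanish up to rounding and the multiplier is $-1 = e^{i\pi}$, matching the half-turn described above.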


5.4 Hyperbolic Elements 


We know that any hyperbolic element is conjugate to $z \mapsto \lambda^2 z$ for some $\lambda > 0$.
What is the right geometric interpretation of such a transformation? We might note
that this is an isometry of $\mathbb{H}^2$ and that it sends the line $x = 0$ back to itself. Indeed,
it isn't hard to see that it has to move all the points on this line in a single direction,
either toward 0 or $\infty$; in other words, one of these points is attractive.


Definition 5.5 Let $X$ be some subset of $\mathbb{R}^n$ and consider a continuous function
$F : X \to X$. We say that $x_0 \in X$ is an attractive fixed point of $F$ if for all $x \in X$,
$\lim_{n \to \infty} F^n(x) = x_0$.


Remark 5.3 Here, $F^n$ should be understood to mean $F$ composed with itself $n$ times,
not $F$ raised to some power (whatever that would even mean).


Remark 5.4 See Figure 5.11 for an illustration of a map $\mathbb{D}^2 \to \mathbb{D}^2$ with an attractive
fixed point.


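We can actually simplify our life a bit and assume that $\infty$ is the attractive fixed
point. This is because

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} \lambda^{-1} & 0 \\ 0 & \lambda \end{pmatrix},$$

so we are always free to replace $\lambda$ with $\lambda^{-1}$. This means that we can always assume
that $\lambda > 1$, in which case $\infty$ is the attractive fixed point. More generally, any
hyperbolic element necessarily has an attractive fixed point.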
Fig. 5.12 A translation by a hyperbolic distance of about 0.19, seen in three different coordinate
frames: in the first row, it is seen in the Poincaré half-plane, with $0, \infty$ as the fixed points; in the
second row, the fixed points are different; in the last row, it is seen in the Poincaré disk, with $-i, i$
as the fixed points. In each illustration, the solid red curve is the line connecting the fixed points.


Theorem 5.6 Let $\varphi \in PSL(2, \mathbb{R})$ be hyperbolic. One of its fixed points is
attractive; call it $z_1$ and the other $z_0$. For every $z \in \mathbb{H}^2$, $\varphi(z)$ lies on the unique
generalized circle passing through $z_0$, $z_1$, and $z$. Restricted to any such generalized
circle, $\varphi$ moves points away from $z_0$ and toward $z_1$. Exactly one of these generalized
circles is a hyperbolic line; restricted to said line, $\varphi$ translates each point by a
constant distance that depends only on $\mathrm{tr}(\varphi)$.


Proof If $z_1$ is an attractive fixed point of $\varphi$, then $\psi(z_1)$ will be an attractive fixed
point of $\psi \circ \varphi \circ \psi^{-1}$. Consequently, the fact that we know that $\varphi$ is conjugate to
$z \mapsto \lambda^2 z$ for some $\lambda > 1$ immediately implies that it has an attractive fixed point.
Indeed, we know that $\varphi$ has two fixed points, but it cannot have two attractive fixed
points. So, let $z_0, z_1 \in \partial\mathbb{H}^2$ be those two fixed points, with $z_1$ the attractive one.
Any point $z \in \mathbb{H}^2$ will lie on some unique generalized circle $C$ through $z, z_0, z_1$.
The corresponding statement for $z \mapsto \lambda^2 z$ is that every point in $\mathbb{H}^2$ lies on some line
Fig. 5.13 An illustration of the dynamics in $\mathbb{CP}^1$ of a hyperbolic element conjugate to $z \mapsto \sqrt{2}z$.
The fixed points are drawn in purple; all other points are moved along circular paths, drawn in red.
The arrows show precisely where each point is moved.


through the origin, and $z \mapsto \lambda^2 z$ sends all such lines back to themselves. Therefore,
$\varphi$ must do the same to the generalized circles $C$. Given a point $z \in C$, where will $\varphi$
send it? It must be sent closer to $z_1$; the corresponding picture for $z \mapsto \lambda^2 z$ is that it
sends all elements in $\mathbb{H}^2$ closer to $\infty$.

Out of all these generalized circles through $z_0$ and $z_1$, only one of them is orthogonal
to $\partial\mathbb{H}^2$ and is, therefore, a line; for $z \mapsto \lambda^2 z$, this is the line $x = 0$. Let $z \in \mathbb{H}^2$
be any point on this line, so that $z = ti$ for some $t > 0$. Then

$$d_{\text{hyper}}(z, \varphi(z)) = d_{\text{hyper}}(ti, \lambda^2 ti) = \ln\left(\frac{\lambda^2 ti}{ti}\right) = 2\ln(\lambda).$$

Notice that $\mathrm{tr}(\varphi) = \lambda + 1/\lambda$, so in fact $d_{\text{hyper}}(z, \varphi(z))$ is some constant that only
depends on $\mathrm{tr}(\varphi)$ and not on the choice of $z \in \mathbb{H}^2$ on the line between the two fixed
points.
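
As a numerical illustration of this last observation, the sketch below compares the distance moved along the line $x = 0$ with a value computed from the trace alone. It assumes the usual curvature $-1$ formula $\cosh d(z, w) = 1 + |z - w|^2 / (2\,\Im(z)\,\Im(w))$ on the half-plane; the identity $d = 2\,\mathrm{arccosh}(|\mathrm{tr}|/2)$ used in `translation_length_from_trace` follows from $\mathrm{tr}(\varphi) = \lambda + 1/\lambda = 2\cosh(\ln\lambda)$. The function names and the sample value $\lambda = 1.1$ are illustrative.

```python
import math


def half_plane_distance(z: complex, w: complex) -> float:
    """Hyperbolic distance in the upper half-plane (curvature -1 is assumed)."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))


def translation_length_from_trace(trace: float) -> float:
    """Translation length 2*ln(lambda), recovered from tr = lambda + 1/lambda."""
    return 2 * math.acosh(abs(trace) / 2)


lam = 1.1
trace = lam + 1 / lam

# Distance moved by z -> lam^2 z at several points t*i of the line x = 0.
moved = [half_plane_distance(t * 1j, lam ** 2 * t * 1j) for t in (0.5, 1.0, 7.0)]
print(moved, 2 * math.log(lam), translation_length_from_trace(trace))
```

All three distances agree with $2\ln(\lambda)$ and with the value obtained from the trace, independently of the point chosen on the line.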


What is the right Euclidean analog to this hyperbolic isometry? No description 
is going to be a perfect match, but we might note that restricted to the unique line 
between the two fixed points, a hyperbolic element is just a translation. Indeed, 
such transformations are typically called (hyperbolic) translations. Off the unique 
translation line, hyperbolic translations are a little weird in that they move points 
along paths that are not lines at all, but always away from one fixed point toward the 
other fixed point, as can be seen from Figure 5.12. 

It is clear how to extend this action to the whole of $\mathbb{CP}^1$: a hyperbolic element
has two fixed points in $\mathbb{CP}^1$, one of which is an attractive fixed point; every element
Fig. 5.14 An illustration of the dynamics on $\mathbb{H}^3$ of a hyperbolic element: the line connecting
the two fixed points is drawn in red. Other paths connecting those two points and sent back to
themselves by the transformation are drawn in purple. Arrows indicate the direction in which the
transformation moves points along these curves.


other than the two fixed ones is moved along generalized circles through the two
fixed points, toward the attractive fixed point. This is shown in Figure 5.13. We can
easily extend to $\mathbb{H}^3$ as well, as in Figure 5.14.


Theorem 5.7 Let $\varphi \in PSL(2, \mathbb{C})$ be hyperbolic. Let $z_0, z_1 \in \partial\mathbb{H}^3$ be the fixed
points of $\varphi$, where $z_1$ is the attractive fixed point. Let $l$ be the line between them. For
every point $p \in \mathbb{H}^3$, $\varphi$ sends the generalized circle through $z_0, z_1, p$ back to itself,
moving $p$ away from $z_0$ and toward $z_1$. Restricted to $l$, $\varphi$ is a translation: it
moves every point away from $z_0$ and toward $z_1$ by a fixed distance that depends only
on $\mathrm{tr}(\varphi)$.


Proof I leave this proof to the reader. (See Exercise 5.2.16.)


5.5 Parabolic Elements 


The astute reader might have found it surprising that we labeled hyperbolic elements
as analogs of Euclidean translations where there is seemingly a more natural candidate:
the transformations conjugate to $z \mapsto z + z_0$, which is to say the parabolic
elements. After all, these elements are Euclidean translations. Counter-intuitively,
parabolic elements are absolutely nothing like Euclidean translations; indeed, there
isn't a Euclidean analog for them at all. The reason for this is that while elliptic
elements can be characterized as those which fix a point and send circles/spheres
centered at that point back on themselves, and hyperbolic elements can be characterized
Fig. 5.15 Families of horocycles in the Poincaré half-plane and disk.


as those that map a particular line back on itself, parabolic elements preserve
an entirely different kind of curve/surface.


Definition 5.6 Let $z$ be a point on the boundary of the hyperbolic plane. A horocycle
is a curve that is orthogonal to every line in the plane that passes through $z$.

Similarly, if $z \in \partial\mathbb{H}^3$, then a horosphere is a surface that is orthogonal to all lines
in $\mathbb{H}^3$ that pass through $z$.


Some examples of horocycles are drawn in Figure 5.15. Intuitively, they are like
circles “at infinity.” Indeed, if one were to replace $z \in \partial\mathbb{H}^2$ with $z \in \mathbb{H}^2$, then we
would exactly be describing a circle. (See Exercise 5.2.1.) As depicted in Figure
5.16, they can be obtained as “limits” of hyperbolic circles as well. They can also be
characterized in a different way that is easier to visualize.


Theorem 5.8 (Characterization of Horocycles) Horocycles are exactly the gener- 
alized circles that are tangent to the boundary of the hyperbolic plane. 


Proof Suppose that we are working in the Poincaré half-plane model and that we
take $z = \infty$. What are the hyperbolic lines that pass through this point? Well, these
are just all of the vertical lines. What are the curves that are orthogonal to all vertical
lines? Horizontal lines! And a horizontal line is nothing more than a generalized circle
that is tangent at infinity to the real line, i.e., the boundary. However, we know that
all of our isometries preserve both angles and generalized circles. Furthermore, we
know that we can always find an isometry that will take $\infty$ to any other point on the
boundary, in either the Poincaré half-plane or the disk. Therefore, all horocycles are
generalized circles that are tangent to the boundary of hyperbolic space. Conversely,
Fig. 5.16 A horocycle envisioned as the limit of a family of hyperbolic circles.


for any generalized circle that is tangent to the boundary of hyperbolic space, we can
use an isometry to move that tangency point to $\infty$, at which point this circle becomes
a horizontal line, i.e., a horocycle.


Theorem 5.9 (Characterization of Horospheres) Horospheres are exactly the gen- 
eralized spheres that are tangent to the boundary of hyperbolic space. 


Proof The idea is the same as before: we can reduce to the case where the chosen
point is $z = \infty$. The lines that pass through this point are simply the vertical lines,
hence the surfaces that are orthogonal to such lines are the horizontal planes. The
horizontal planes are exactly those generalized spheres that are tangent to $\partial\mathbb{H}^3$ at $\infty$.


Here’s the kicker: parabolic elements are exactly those isometries that send horo- 
cycles/horospheres back on themselves. 


Theorem 5.10 Choose any $\varphi \in PSL(2, \mathbb{R})$. The following properties are equivalent.

1. $\varphi$ is parabolic.

2. There exists a horocycle $H$ such that $\varphi(H) = H$.

3. There exists $z \in \partial\mathbb{H}^2$ such that for all horocycles $H$ tangent to the boundary at
$z$, $\varphi(H) = H$.


Proof The third condition implies the second. We can show that the second condition
also implies the first. This is because if $H$ is a horocycle, then it is tangent to $\partial\mathbb{H}^2$ at
exactly one point $z$, and since $\varphi$ preserves $\partial\mathbb{H}^2$ and tangency, the fact that $\varphi(H) = H$
implies $\varphi(z) = z$. Therefore, $\varphi$ has at least one fixed point on the boundary. Can it
have more? If it does, then it is hyperbolic, and there is some line passing through $z$
Fig. 5.17 Parabolic elements conjugate to $z \mapsto z - 1/3$, seen in three different coordinate frames:
in the first row, it is seen in the Poincaré half-plane, with $\infty$ as the fixed point; in the second row,
the fixed point is 3/2; in the last row, it is seen in the Poincaré disk. In each illustration, horocycles
tangent to the fixed point are drawn in purple.


along which $\varphi$ simply translates points. But this line has a unique intersection with
$H$, so this contradicts the fact that $\varphi(H) = H$. Therefore, $\varphi$ has only one fixed
point; it is parabolic.

To finish the proof of the theorem, it shall suffice to prove that if $\varphi$ is parabolic,
then there is some point $z \in \partial\mathbb{H}^2$ such that $\varphi$ preserves all horocycles tangent at that
point. Since $\varphi$ is parabolic, we may assume without loss of generality (because we can
always conjugate everything if need be) that $\varphi(z) = z + z_0$, which certainly preserves
all the horocycles that are tangent at $\infty$, as these are just horizontal (Euclidean)
lines.


This action of parabolic elements on families of tangent horocycles can be seen
in Figure 5.17. The same is true in $\mathbb{H}^3$ as well: we simply have to replace tangent
families of horocycles with tangent families of horospheres.
Fig. 5.18 An illustration of the dynamics in $\mathbb{CP}^1$ of a parabolic element conjugate to $z \mapsto z + 1/2$.
The fixed point is drawn in purple, as is the family of generalized circles tangent at that point which
is preserved by the action of this element. Arrows show the direction of the flow.


Theorem 5.11 Choose any $\varphi \in PSL(2, \mathbb{C})$. The following properties are equivalent.

1. $\varphi$ is parabolic.

2. There exists a horosphere $H$ such that $\varphi(H) = H$.

3. There exists $z \in \partial\mathbb{H}^3$ such that for all horospheres $H$ tangent to the boundary at
$z$, $\varphi(H) = H$.


Proof I leave this proof as an exercise to the reader. (See Exercise 5.2.17.)


The wonderful thing is that these results allow us to immediately understand the 
dynamics of parabolic elements on hyperbolic space. 


Theorem 5.12 Let $\varphi \in PSL(2, \mathbb{R})$ be parabolic. Let $z_0 \in \partial\mathbb{H}^2$ be its unique fixed
point.

1. For any $w \in \mathbb{H}^2$, there exists a unique horocycle that passes through both $w$ and $z_0$.
2. Restricted to any horocycle $H$ tangent to the boundary at $z_0$, $\varphi$ moves points on
$H$ by a fixed distance either clockwise or counterclockwise.


Proof We can reduce to the case where $z_0 = \infty$ and $\varphi(z) = z + b$ for some $b \in \mathbb{R}$.
In this case, $y = \Im(w)$ will be the unique horocycle that passes through $z_0$ and $w$, and
$\varphi$ either moves all points on this curve to the left, or to the right. Furthermore,

$$d_{\text{hyper}}(w, \varphi(w)) = d_{\text{hyper}}(w, w + b) = d_{\text{hyper}}(\Im(w)i, \Im(w)i + b),$$

which only depends on the particular horocycle $H$, and not $w \in H$.
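
A quick numerical check of this computation is sketched below, again assuming the curvature $-1$ half-plane distance $\cosh d(z, w) = 1 + |z - w|^2 / (2\,\Im(z)\,\Im(w))$. Under that assumption the displacement along the horocycle $y = h$ has the closed form $2\,\mathrm{arcsinh}(|b| / (2h))$, which is used only as a cross-check; the function names and sample values are illustrative.

```python
import math


def half_plane_distance(z: complex, w: complex) -> float:
    """Hyperbolic distance in the upper half-plane (curvature -1 is assumed)."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))


def parabolic_displacement(w: complex, b: float) -> float:
    """Distance between w and its image under z -> z + b."""
    return half_plane_distance(w, w + b)


b = 0.5
# Points on the same horocycle y = 2 are all moved by the same distance...
same_height = [parabolic_displacement(complex(x, 2.0), b) for x in (-3.0, 0.0, 4.5)]
# ...while different horocycles (different heights) give different distances.
by_height = {y: parabolic_displacement(complex(0.0, y), b) for y in (0.5, 1.0, 2.0)}
print(same_height, by_height)
```

The displacement shrinks as the horocycle moves away from the boundary, which is one way to see that no point of the hyperbolic plane is moved by the same amount as all the others.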


It is easy to extend this to all of $\mathbb{CP}^1$: if $\varphi \in PSL(2, \mathbb{C})$ is parabolic, then it has
some fixed point $z_0 \in \mathbb{CP}^1$, and there exists some generalized circle $C_0$ passing
through $z_0$ such that
Fig. 5.19 An illustration of the dynamics on $\mathbb{H}^3$ of a parabolic element.


1. $\varphi(C_0) = C_0$,

2. if $C$ is a generalized circle tangent to $C_0$ at $z_0$, then $\varphi(C) = C$, and

3. if $\varphi$ is restricted to any such generalized circle $C$, then it moves all points along
$C$, away from $z_0$ and back toward $z_0$.


This flow on $\mathbb{CP}^1$ is depicted in Figure 5.18. Why is this so? Well, any parabolic element
is conjugate to $z \mapsto z + z_0$, which preserves the family of Euclidean lines parallel to
the line $\{z_0 t \mid t \in \mathbb{R}\}$; this is precisely a family of generalized circles all tangent to one
another at the fixed point of the parabolic element. Since such things are preserved by
conjugation, this must be true of all parabolic elements. Of course, once we understand what
happens on $\mathbb{CP}^1$, we can understand what happens on $\mathbb{H}^3$. The associated drawing is Figure 5.19.


Theorem 5.13 Let $\varphi \in PSL(2, \mathbb{C})$ be parabolic, with fixed point $z \in \partial\mathbb{H}^3$. There
exists a hyperbolic plane $P$ passing through $z$ such that $\varphi(P) = P$. Furthermore:

1. If $P'$ is a plane tangent to $P$ at $z$, then $\varphi(P') = P'$, and the restriction of $\varphi$ to $P'$
is parabolic.

2. For all $p \in \mathbb{H}^3$, there exists a unique plane $P'$ tangent to $P$ at $z$ that passes
through $p$.

3. If $H$ is a horosphere passing through $z$ and $P'$ is a plane tangent to $P$ at $z$, then
their intersection is a horocycle in $P'$.


Proof Without loss of generality, we may assume that $\varphi(p) = p + b$ for some
$b \in \mathbb{C}$ and so the fixed point is $\infty$. Choose any point $z_0 \in \mathbb{C}$; then the Euclidean
plane $P_{z_0}$ passing through the points $z_0$, $z_0 + b$, $z_0 + j$ will be a hyperbolic plane
passing through $z_0$ such that $\varphi(P_{z_0}) = P_{z_0}$. Hyperbolic planes tangent to $P_{z_0}$ at $\infty$
will exactly be the planes $P_{z_1}$ for other choices of $z_1 \in \mathbb{C}$. From this, it is easy to
see that any point $p \in \mathbb{H}^3$ is contained in exactly one such plane; specifically, writing
$p = z + tj$ with $z \in \mathbb{C}$ and $t > 0$, it will be contained in the plane $P_z$. Restricted to any
such plane, $\varphi$ is certainly parabolic: it has exactly one fixed point, $\infty$. The horospheres
passing through $\infty$ are just horizontal planes, and the intersection of a vertical and a
horizontal plane is a horizontal line in one of the planes $P_{z_1}$, which is just a horocycle.


5.6 Loxodromic Elements 


Loxodromic elements are quite interesting in that they are the only type of transformation
that does not occur in $PSL(2, \mathbb{R})$ but does appear in $PSL(2, \mathbb{C})$. On the
other hand, their action on $\mathbb{H}^3$ is not hard to understand: they are just a combination
of a translation and a rotation along a common line.


Theorem 5.14 Let $\varphi \in PSL(2, \mathbb{C})$ be loxodromic, with fixed points $z_0, z_1 \in \partial\mathbb{H}^3$.
Let $l$ be the unique line that intersects the boundary at $z_0$ and $z_1$. There exists a
unique hyperbolic element $\tau$ and an elliptic element $\phi$ such that $\varphi = \tau \circ \phi$, $\tau$ is a
translation along $l$, and $\phi$ is a rotation around $l$.


Proof Without loss of generality, $z_0 = 0$, $z_1 = \infty$, and $\varphi(z) = re^{i\theta}z$ for some
$1 \neq r > 0$ and $0 < \theta < 2\pi$. If $\tau$ is a translation along $l$, then it is of the form
$z \mapsto r'z$ for some $1 \neq r' > 0$; if $\phi$ is a rotation around $l$, then it is of the form
$z \mapsto e^{i\theta'}z$ for some $0 < \theta' < 2\pi$. Then $(\tau \circ \phi)(z) = r'e^{i\theta'}z$, which is equal to $\varphi$ if
and only if $r' = r$, $\theta' = \theta$.
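
In the normalized situation of the proof, the decomposition amounts to splitting a complex multiplier into its modulus and argument. The sketch below illustrates this for the action on $\mathbb{C}$ only; the function names and the sample multiplier are illustrative assumptions.

```python
import cmath
import math


def split_loxodromic(multiplier: complex):
    """Return (r, theta) with multiplier = r * e^{i theta} and 0 <= theta < 2*pi."""
    r = abs(multiplier)
    theta = cmath.phase(multiplier) % (2 * math.pi)
    return r, theta


def translation(r: float):
    """The hyperbolic element z -> r z (translation along the line from 0 to infinity)."""
    return lambda z: r * z


def rotation(theta: float):
    """The elliptic element z -> e^{i theta} z (rotation around the same line)."""
    return lambda z: cmath.exp(1j * theta) * z


multiplier = 2 * cmath.exp(1j * 4.0)
r, theta = split_loxodromic(multiplier)
tau, rho = translation(r), rotation(theta)

z = 0.3 - 1.2j
print(r, theta, tau(rho(z)), multiplier * z)
```

Since both factors are multiplications by complex numbers, they commute, so the order of the translation and the rotation does not matter in this normalized form.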


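A consequence of this is that loxodromic elements move points in $\mathbb{H}^3$ along infinite
spiral paths. We can also see this as follows: let $\varphi \in PSL(2, \mathbb{C})$ be loxodromic, with
fixed points $z_0, z_1 \in \partial\mathbb{H}^3$. Choose any $\psi \in PSL(2, \mathbb{C})$ such that $\psi(0) = z_0$ and
$\psi(\infty) = z_1$. Then $(\psi^{-1} \circ \varphi \circ \psi)(z) = L.z$ for some

$$L = \begin{pmatrix} re^{i\theta} & 0 \\ 0 & r^{-1}e^{-i\theta} \end{pmatrix} \in SL(2, \mathbb{C}).$$

Now, for any $t \in \mathbb{R}$, define

$$L^t = \begin{pmatrix} r^t e^{it\theta} & 0 \\ 0 & r^{-t}e^{-it\theta} \end{pmatrix} \in SL(2, \mathbb{C}).$$

It is easy to check that $L^{t+t'} = L^t L^{t'}$ and that $L^1 = L$. Therefore, for any $p \in \mathbb{H}^3$,
$L.(L^t.p) = L^{t+1}.p$. With this in mind, for any $p = z + tj \in \mathbb{H}^3$, define

$$\rho_p(s) := L^s.p = r^s e^{is\theta}(z + tj)r^s e^{is\theta} = r^{2s}\left(e^{2is\theta}z + tj\right).$$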


Fig. 5.20 An illustration of the spiraling paths (drawn in yellow) preserved by loxodromic
transformations. The line along which the loxodromic transformation acts as a translation is drawn in
red. Arrows show the direction in which points are moved along.
Fig. 5.21 A visualization of the action of a loxodromic element on $\mathbb{CP}^1$. On the left-hand side, the
fixed points are 0 and $\infty$; points are moved outward along the spirals. On the right-hand side, the
fixed points are both points in the plane, but still one of them is attractive and points move along
spirals from one to the other.


This is a spiraling path in $\mathbb{H}^3$, but notice that due to the way that we have defined
it, $\psi^{-1} \circ \varphi \circ \psi$ simply moves points on $\rho_p$ back to points on $\rho_p$, but further up the
spiral. An immediate corollary is that if we define $\varrho_p(t) := \psi(L^t.p)$, then this is
a spiraling path that is preserved by $\varphi$. Both of these are illustrated in Figure 5.20.
We can apply this same idea to understand the dynamics on $\mathbb{CP}^1$. In the special
case where the fixed points are 0 and $\infty$, we have spirals $\rho_z(s) = r^{2s}e^{2is\theta}z$; in general,
it will instead be some curve $\varrho_z(s) = \psi(r^{2s}e^{2is\theta}z)$. In any case, one of
the two fixed points will be attractive, and the effect of the loxodromic element
$\varphi \in PSL(2, \mathbb{C})$ is to move points along these spiraling curves away from one of the
fixed points and toward the other. This is illustrated in Figure 5.21.
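
The special case with fixed points 0 and $\infty$ is easy to experiment with. The following sketch samples the spiral $s \mapsto r^{2s}e^{2is\theta}z$ in $\mathbb{C}$ and checks that the normalized loxodromic map $z \mapsto r^2e^{2i\theta}z$ advances the parameter by exactly 1; the function names and the values of $r$, $\theta$, and the starting point are illustrative assumptions.

```python
import cmath


def spiral_point(z0: complex, r: float, theta: float, s: float) -> complex:
    """Point at parameter s on the spiral through z0 (fixed points 0 and infinity)."""
    return (r ** (2 * s)) * cmath.exp(2j * s * theta) * z0


def loxodromic(z: complex, r: float, theta: float) -> complex:
    """The normalized loxodromic map z -> r^2 e^{2 i theta} z."""
    return (r ** 2) * cmath.exp(2j * theta) * z


r, theta, z0 = 1.2, 0.7, 1 + 0.5j

# Applying the map to the point at parameter s lands on the point at s + 1.
gaps = [
    abs(loxodromic(spiral_point(z0, r, theta, s), r, theta) - spiral_point(z0, r, theta, s + 1))
    for s in (-1.0, 0.0, 0.25)
]
print(gaps)
```

With $r > 1$ the points move outward, away from 0 and toward $\infty$, so $\infty$ plays the role of the attractive fixed point.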


5.7 Other Decomposition Theorems 


We finish this chapter by giving a few other ways that we can decompose the groups 
PSL(2, R) and PSL(2, C)—the arguments will be a mixture of algebra and geom- 
etry, using a bit of everything that we have learned. 


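Theorem 5.15 Every element $\varphi \in PSL(2, \mathbb{C})$ is conjugate to exactly one element
in the set

$$\{z \mapsto z + 1\} \cup \{z \mapsto az \mid a \in \mathbb{C}^\times, |a| > 1\} \cup \{z \mapsto e^{i\theta}z \mid 0 \leq \theta \leq \pi\}.$$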


Proof We know that any element $\varphi \in PSL(2, \mathbb{C})$ is either the identity, elliptic,
hyperbolic, parabolic, or loxodromic. The identity is conjugate to itself and nothing
else, so it suffices to note that it is in our defined set. If $\varphi \in PSL(2, \mathbb{C})$ is parabolic,
then we know it is conjugate to $z \mapsto z + b$ for some non-zero $b \in \mathbb{C}$. However, we
can actually take $b = 1$, because

$$\begin{pmatrix} b^{-1/2} & 0 \\ 0 & b^{1/2} \end{pmatrix} \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} b^{-1/2} & 0 \\ 0 & b^{1/2} \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

So, every parabolic element is conjugate to $z \mapsto z + 1$, which is in our set. As the
other elements in the set are not parabolic (they have two fixed points), $\varphi$ is only
conjugate to one such element.

Now, if $\varphi$ is not parabolic or the identity, then it is elliptic, hyperbolic, or loxodromic.
In all of these cases, we know that it is conjugate to $z \mapsto az$ for some
$a \in \mathbb{C}^\times \setminus \{1\}$. However, since

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} a^{1/2} & 0 \\ 0 & a^{-1/2} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} a^{-1/2} & 0 \\ 0 & a^{1/2} \end{pmatrix},$$

we can always replace $a$ with $1/a$ instead, which means that we can assume that
$|a| \geq 1$. Moreover, if $|a| = 1$, then we may assume that $a = e^{i\theta}$ with $0 < \theta \leq \pi$,
since $1/a = e^{-i\theta}$. Therefore, $\varphi$ is conjugate to an element in our set. Why can't it
be conjugate to more than one? Well, note that if it is conjugate to $z \mapsto az$, then
$\mathrm{tr}^2(\varphi) = \mathrm{tr}^2(z \mapsto az)$. Since

$$\mathrm{tr}\begin{pmatrix} a^{1/2} & 0 \\ 0 & a^{-1/2} \end{pmatrix} = a^{1/2} + a^{-1/2},$$

$\mathrm{tr}^2(\varphi) = 2 + a + 1/a$. So, if $\mathrm{tr}^2(z \mapsto az) = \mathrm{tr}^2(\varphi) = \mathrm{tr}^2(z \mapsto a'z)$, then either
$a = a'$ or $a = 1/a'$. If $|a| \neq 1$, then exactly one of $a, a'$ has norm greater than 1;
otherwise, we use the fact that exactly one is of the form $e^{i\theta}$ with $0 < \theta \leq \pi$.
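
The uniqueness argument is effectively an algorithm: from $\mathrm{tr}^2$ one recovers $a + 1/a$, solves a quadratic, and then picks the root singled out by the theorem. The sketch below follows those steps for a matrix in $SL(2, \mathbb{C})$. The function name, the string labels in the return value, and the tolerance `tol` are illustrative assumptions, and the floating-point comparisons are only heuristics near $\mathrm{tr}^2 = 4$ and $|a| = 1$.

```python
import cmath


def canonical_representative(M, tol: float = 1e-9):
    """Return ("translation", 1) for z -> z + 1, or ("multiplier", a) for z -> a z."""
    (a, b), (c, d) = M
    tr2 = (a + d) ** 2
    if abs(tr2 - 4) < tol:
        # Trace squared 4: either plus or minus the identity, or parabolic.
        is_identity = abs(b) < tol and abs(c) < tol and abs(a - d) < tol
        return ("multiplier", 1 + 0j) if is_identity else ("translation", 1 + 0j)
    k = tr2 - 2                     # k = a + 1/a for the multiplier a
    root = cmath.sqrt(k * k - 4)
    mult = (k + root) / 2           # one root of a^2 - k a + 1 = 0
    if abs(mult) < 1 - tol:
        mult = 1 / mult             # prefer the root with |a| > 1
    elif abs(abs(mult) - 1) <= tol and mult.imag < 0:
        mult = 1 / mult             # on the unit circle, prefer 0 < theta <= pi
    return ("multiplier", mult)


half_turn = canonical_representative(((0, 1), (-1, 0)))            # z -> -1/z
stretch = canonical_representative(((3.5, -3.0), (1.5, -1.0)))     # a conjugate of diag(2, 1/2)
shear = canonical_representative(((1, 5), (0, 1)))                 # z -> z + 5
print(half_turn, stretch, shear)
```

In this sketch, $z \mapsto -1/z$ is assigned the multiplier $-1$, the conjugate of $\mathrm{diag}(2, 1/2)$ is assigned the multiplier 4, and any non-trivial shear is sent to the single representative $z \mapsto z + 1$.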


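Corollary 5.1 (Jordan Decomposition) For every matrix $M \in SL(2, \mathbb{C})$, there
exists $U \in SL(2, \mathbb{C})$ such that $M = UJU^{-1}$, where $J$ is one of the matrices in the set

$$\left\{ \pm\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \right\} \cup \left\{ \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \,\middle|\, \lambda \in \mathbb{C}^\times \right\}.$$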


Remark 5.5 This is just Jordan’s normal form theorem, as applied to SL(2, C). 




Proof Define $\varphi(z) = M.z$. By the previous theorem, there exists $\psi \in PSL(2, \mathbb{C})$,
which we shall write as $\psi(z) = U.z$ for some $U \in SL(2, \mathbb{C})$, such that $\varphi =
\psi \circ \gamma \circ \psi^{-1}$ for some $\gamma \in PSL(2, \mathbb{C})$ of a special form. To be precise, $\gamma(z) = J.z$
for some matrix $J$ in the set defined in the statement of the corollary. But this means
that $M = \pm UJU^{-1}$. Since $-J$ is in the aforementioned set if and only if $J$ is, we see
that without loss of generality, $M = UJU^{-1}$.


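Theorem 5.16 For every $\varphi \in PSL(2, \mathbb{R})$, there exists

1. an element $n_x(z) = z + x$ for some $x \in \mathbb{R}$,
2. an element $a_r(z) = r^2 z$ for some $r > 0$, and
3. an element $k_\theta(z) = \begin{pmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{pmatrix}.z$ for some $0 \leq \theta < 2\pi$

such that $\varphi = k_\theta \circ a_r \circ n_x$. Moreover, $\theta, r, x$ are uniquely determined by $\varphi$.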


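Proof Write $x_0 + y_0 i = \varphi^{-1}(i)$ for some $x_0 \in \mathbb{R}$, $y_0 > 0$. If we define $\varphi' =
\varphi \circ n_{x_0} \circ a_{\sqrt{y_0}}$, then $\varphi'(i) = \varphi(x_0 + y_0 i) = i$. Since $\varphi'$ has a fixed point in $\mathbb{H}^2$, it is
either the identity or an elliptic element; in either case, it can be written as

$$\varphi'(z) = \begin{pmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{pmatrix}.z$$

for some $0 \leq \theta < 2\pi$ (see the exercise at the end of Section 4.6). Therefore,
$\varphi = k_\theta \circ a_{1/\sqrt{y_0}} \circ n_{-x_0}$, establishing the existence of the desired decomposition. Why
is it unique? Well, $\varphi^{-1} = n_{-x} \circ a_{1/r} \circ k_{2\pi - \theta}$, so $\varphi^{-1}(i) = (n_{-x} \circ a_{1/r})(i) = -x + i/r^2$,
so $x$ and $r$ are certainly uniquely determined by $\varphi$. The uniqueness of $\theta$ follows
immediately.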
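
The proof is constructive, so the three parameters can be computed directly from a matrix. The sketch below assumes the matrix conventions $k_\theta \leftrightarrow \begin{pmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{pmatrix}$, $a_r \leftrightarrow \begin{pmatrix} r & 0 \\ 0 & 1/r \end{pmatrix}$, and $n_x \leftrightarrow \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$; the function names are illustrative. It reads $x$ and $r$ off $\varphi^{-1}(i) = -x + i/r^2$ and then recovers $\theta$ from the first column of the matrix, which equals $r(\cos(\theta/2), \sin(\theta/2))$ under these conventions.

```python
import math


def mobius(M, z):
    """Apply the linear fractional transformation M.z."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def kan_matrix(theta: float, r: float, x: float):
    """Matrix of k_theta o a_r o n_x under the conventions stated above."""
    h = theta / 2
    k = ((math.cos(h), -math.sin(h)), (math.sin(h), math.cos(h)))
    a = ((r, 0.0), (0.0, 1 / r))
    n = ((1.0, x), (0.0, 1.0))
    return matmul(matmul(k, a), n)


def kan_parameters(M):
    """Recover (theta, r, x) from a real 2x2 matrix of determinant 1."""
    (a, b), (c, d) = M
    w = mobius(((d, -b), (-c, a)), 1j)   # w = M^{-1}.i = -x + i / r^2
    x = -w.real
    r = 1 / math.sqrt(w.imag)
    theta = (2 * math.atan2(c / r, a / r)) % (2 * math.pi)
    return theta, r, x


M = kan_matrix(5.0, 1.5, -0.7)
theta, r, x = kan_parameters(M)
print(theta, r, x)
```

Replacing `M` by its negative gives the same parameters, as it should for an element of $PSL(2, \mathbb{R})$.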


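Corollary 5.2 (Iwasawa Decomposition) Consider the subgroups

$$K = \left\{ \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \,\middle|\, \theta \in \mathbb{R} \right\}, \quad A = \left\{ \begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix} \,\middle|\, r > 0 \right\}, \quad N = \left\{ \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \,\middle|\, x \in \mathbb{R} \right\}$$

of $SL(2, \mathbb{R})$. For every $M \in SL(2, \mathbb{R})$, there exist unique matrices $k \in K$, $a \in A$,
$n \in N$ such that $M = kan$.

Fig. 5.22 An illustration of how to obtain the solid torus by rotating an open disk around the y-axis.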


Remark 5.6 Iwasawa decompositions are far more broadly applicable than just for 
SL(2, R)—to start with, every group SL(n, R) has an Iwasawa decomposition, but 
even this is barely scratching the surface. 


Proof I leave this one to the reader. (See Exercise 5.2.18.)


There are many nice algebraic consequences for the Iwasawa decomposition. In 
the general spirit of this book, we conclude with a pretty geometric application. 


Corollary 5.3 There exists a bi-continuous map (that is, continuous with a contin- 
uous inverse) between SL(2, R) and a solid torus. 


Proof We shall prove that there is a bi-continuous map $S^1 \times \mathbb{R}^2 \to SL(2, \mathbb{R})$ (where
$S^1$ is the unit circle) and a bi-continuous map from $S^1 \times \mathbb{R}^2$ to the solid torus; the
desired bi-continuous map can then be obtained as a composition. For the first part,
we define a function

$$S^1 \times \mathbb{R} \times \mathbb{R} \to SL(2, \mathbb{R})$$

$$(e^{i\theta}, t, x) \mapsto \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} e^t & 0 \\ 0 & e^{-t} \end{pmatrix} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}.$$

That this is bijective is simply the content of the Iwasawa decomposition. That it
is continuous is clear: it is a composition of continuous functions like addition,
multiplication, and exponentiation. Is the inverse continuous? We could compute it
(see Exercise 5.1.6) and see this directly, but here's a different argument. Suppose
that we change the entries of a matrix $M \in SL(2, \mathbb{R})$ by a small amount, giving a
matrix $\tilde{M} \in SL(2, \mathbb{R})$. Then $\tilde{w} = \tilde{M}^{-1}.i$ must be close to $w = M^{-1}.i$, hence in the
Iwasawa decomposition of $\tilde{M}$, the constants $r$ and $x$ are close to what they are for
$M$. This in turn implies that the constant $\theta$ is close for $M$ and $\tilde{M}$, which allows us to
conclude that the inverse map is continuous.

Now, why is $S^1 \times \mathbb{R}^2$ bi-continuous with a solid torus? Well, one way to think
about the solid torus $T$ is that it is what you get if you rotate an open disk like
$(x - 2)^2 + y^2 < 1$ around the y-axis; see Figure 5.22 for an illustration. If we can
find a bi-continuous function $F : \mathbb{R}^2 \to \mathbb{D}^2$, we will be done, because then we get a
bi-continuous function $S^1 \times \mathbb{R}^2 \to S^1 \times \mathbb{D}^2$, which we compose with

$$S^1 \times \mathbb{D}^2 \to T$$
$$(e^{i\theta}, x, y) \mapsto ((x + 2)\cos(\theta), y, (x + 2)\sin(\theta))$$

to get the desired bi-continuous function. (This latter map is just what we get by
rotating a point in the open disk around the y-axis by $\theta$ radians.) Can you map
$\mathbb{R}^2 \to \mathbb{D}^2$ in a bi-continuous way? Of course: identify $\mathbb{R}^2$ with $\mathbb{C}$ as we have
throughout this book, and define


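$$f : \mathbb{C} \to \mathbb{D}^2, \qquad z \mapsto \frac{z}{\sqrt{1 + |z|^2}}.$$

Does this really have the image that we say it does? Yes:

$$|f(z)|^2 = \frac{|z|^2}{1 + |z|^2} < 1.$$

Is this function invertible? Yes: define

$$g : \mathbb{D}^2 \to \mathbb{C}, \qquad z \mapsto \frac{z}{\sqrt{1 - |z|^2}}.$$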
Then 
Fig. 5.23 In some sense, this is a picture of $SL(2, \mathbb{R})$.


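$$(f \circ g)(z) = f\left(\frac{z}{\sqrt{1 - |z|^2}}\right) = \frac{z / \sqrt{1 - |z|^2}}{\sqrt{1 + \frac{|z|^2}{1 - |z|^2}}} = \frac{z}{\sqrt{1 - |z|^2 + |z|^2}} = z$$

and

$$(g \circ f)(z) = g\left(\frac{z}{\sqrt{1 + |z|^2}}\right) = \frac{z / \sqrt{1 + |z|^2}}{\sqrt{1 - \frac{|z|^2}{1 + |z|^2}}} = \frac{z}{\sqrt{1 + |z|^2 - |z|^2}} = z,$$

so they are inverses. That they are both continuous is also clear, and so we are
done.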
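
The two maps are simple enough to test directly. The sketch below checks, on a few sample points, that $z \mapsto z/\sqrt{1 + |z|^2}$ lands inside the unit disk and that $z \mapsto z/\sqrt{1 - |z|^2}$ undoes it; the names `to_disk` and `from_disk` and the sample points are illustrative choices.

```python
import math


def to_disk(z: complex) -> complex:
    """The map f: C -> D^2, z -> z / sqrt(1 + |z|^2)."""
    return z / math.sqrt(1 + abs(z) ** 2)


def from_disk(w: complex) -> complex:
    """The map g: D^2 -> C, w -> w / sqrt(1 - |w|^2); requires |w| < 1."""
    return w / math.sqrt(1 - abs(w) ** 2)


samples = [0j, 1 + 0j, -2 + 3j, 0.1 - 0.4j, 6 - 8j]
images = [to_disk(z) for z in samples]
round_trip_errors = [abs(from_disk(w) - z) for w, z in zip(images, samples)]
print([abs(w) for w in images], round_trip_errors)
```

Points far from the origin are sent very close to the boundary circle, so in floating-point arithmetic the round trip loses accuracy for very large $|z|$ even though the maps are exact inverses.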


Naturally, this result implies that $SL(2, \mathbb{R})$ is bi-continuous with anything that
is bi-continuous with a solid torus, which includes anything that you could get by
continuously deforming it. A tea mug, like the one in Figure 5.23, would be an example.
Problems
5.1 COMPUTATIONAL EXERCISES 


1. Consider the points A = ei, B = e”'/°, C =i in H’, along with the triangle 
AABC. 


a) Show that $\triangle ABC$ is a right triangle, in that the angle at $C$ has angle measure
$\pi/2$.

b) Let $a$ be the length of $BC$; $b$ be the length of $CA$; $c$ be the length of $AB$.
Compute $a, b, c$.

c) How does $a^2 + b^2$ compare with $c^2$? What does this tell you about the
Pythagorean theorem in hyperbolic geometry?


2. Find a regular hexagon $P$ in the hyperbolic plane such that the sum of the measures
of its angles is $3\pi$.

3. Let $P$ be a regular hexagon in the hyperbolic plane, such that the sum of the
measures of its angles is $3\pi$. Compute its area.

4. Suppose that we choose an element of $SL(2, \mathbb{R})$ randomly, as follows. First, roll
a 6-sided die and divide the result by 3; do this three times to obtain three real
numbers $a, b, d$. Then, define $c = (ad - 1)/b$ so that $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{R})$.


a) What is the probability that $M$ is elliptic?
b) What is the probability that $M$ is hyperbolic?
c) What is the probability that $M$ is parabolic?


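5. Consider the horocycle $H$ in $\mathbb{H}^2$ defined by $x^2 + (y - 1)^2 = 1$. Determine the
set of elements $\varphi \in PSL(2, \mathbb{R})$ such that $\varphi(H) = H$.
6. Consider the function

$$S^1 \times \mathbb{R} \times \mathbb{R} \to SL(2, \mathbb{R})$$

$$(e^{i\theta}, t, x) \mapsto \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} e^t & 0 \\ 0 & e^{-t} \end{pmatrix} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix};$$

here, $S^1$ denotes the unit circle. Compute the inverse of this function.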


5.2 PROOFS 


1. Choose any point $z$ in the hyperbolic plane and consider the set of lines passing
through $z$. Prove that the curves that are orthogonal to each of the aforementioned
lines are hyperbolic circles centered at $z$. (Hint: you may wish to choose $z$ to be
some particularly convenient point.)


2. Let $\triangle ABC$, $\triangle A'B'C'$ be idealized hyperbolic triangles with all of their vertices
on the boundary. Prove that there exists a unique isometry $\varphi$ such that $\varphi(A) = A'$,
$\varphi(B) = B'$, and $\varphi(C) = C'$.


3. Let $\triangle ABC$ be a hyperbolic triangle. Let $l$ be an angle bisector of the angle at
the vertex $A$, that is, a hyperbolic line $l$ that passes through $A$ such that the
angle from $AB$ to $l$ is equal to the angle from $l$ to $CA$. Prove that $l$ intersects
$BC$. (Hint: the fact that $l$ is an angle bisector is mostly irrelevant; we just need
to know that $l$ passes through $A$ and is between $AB$ and $CA$.)


4. Let $\triangle ABC$ be a hyperbolic triangle. Let $l$ be the angle bisector of $A$ and $l'$ the
angle bisector of $B$. Prove that $l$ and $l'$ intersect at some point $p$. This point is
known as the incenter.


5. Let $\triangle ABC$ be a hyperbolic triangle in $\mathbb{H}^2$. Prove that there exists a unique
isometry $\varphi \in \mathrm{Isom}(\mathbb{H}^2, d_{\text{hyper}})$ such that $\varphi(A) = i$, $\varphi(B) = it$ with $t > 1$, and
$\varphi(C) = x + iy$ with $x, y > 0$. We shall say that $\varphi$ puts $\triangle ABC$ into standard
position.


6. We prove a result known as the hyperbolic law of cosines. Let $\triangle ABC$ be a
hyperbolic triangle. Denote its angle measures by $\angle A$, $\angle B$, and $\angle C$ and the
lengths of its sides opposite to those angles to be $a$, $b$, and $c$.


a) Prove that there exists an isometry $\varphi$ sending $\triangle ABC$ into $\mathbb{D}^2$ such that
$\varphi(A) = 0$, $0 < \varphi(C) < 1$, and $\Im(\varphi(B)) > 0$. Since hyperbolic isometries
preserve both angles and distances, we shall henceforth simply assume
that $A = 0$, $0 < C < 1$, and $\Im(B) > 0$.

b) Prove that $C = \tanh(b/2)$ and $B = \tanh(c/2)e^{i\angle A}$.

c) Prove that

$$\cosh(a) = \cosh(b)\cosh(c) - \cos(\angle A)\sinh(b)\sinh(c).$$

(Hint: $\cosh(a) = \cosh(d_{\text{hyper}}(B, C))$. You will need to use a lot of identities
of hyperbolic functions to simplify everything.)


7. Let △ABC be a hyperbolic triangle. Denote its angle measures by ∠A, ∠B,
and ∠C and the lengths of the sides opposite to those angles by a, b, and c.
From the hyperbolic law of cosines, we know that

\[ \cos(\angle A) = \frac{\cosh(b)\cosh(c) - \cosh(a)}{\sinh(b)\sinh(c)}. \]

a) Prove that

\[ \frac{\sin^2(\angle A)}{\sinh(a)^2} = \frac{1 - \alpha_a^2 - \alpha_b^2 - \alpha_c^2 + 2\alpha_a\alpha_b\alpha_c}{\sinh(a)^2\sinh(b)^2\sinh(c)^2}, \]

where α_a = cosh(a), α_b = cosh(b), and α_c = cosh(c).

b) Prove that

\[ \frac{\sin(\angle A)}{\sinh(a)} = \frac{\sin(\angle B)}{\sinh(b)} = \frac{\sin(\angle C)}{\sinh(c)}. \]

This is known as the hyperbolic law of sines.


8. Let △ABC, △A′B′C′ be hyperbolic triangles such that the length of AB is
equal to the length of A′B′, ∠A = ∠A′, and ∠B = ∠B′. Prove that there exists
a hyperbolic isometry g such that g(A) = A′, g(B) = B′, and g(C) = C′. This
is the ASA theorem. (Hint: put △ABC and △A′B′C′ into standard position.)

9. Let △ABC, △A′B′C′ be hyperbolic triangles such that the length of AB is
equal to the length of A′B′, the length of BC is equal to the length of B′C′, and
∠B = ∠B′. Prove that there exists a hyperbolic isometry g such that g(A) = A′,
g(B) = B′, and g(C) = C′. This is the SAS theorem. (Hint: put △ABC and
△A′B′C′ into standard position.)

10. Let △ABC, △A′B′C′ be hyperbolic triangles such that the length of AB is
equal to the length of A′B′, the length of BC is equal to the length of B′C′,
and the length of CA is equal to the length of C′A′. Prove that there exists a
hyperbolic isometry g such that g(A) = A′, g(B) = B′, and g(C) = C′. This
is the SSS theorem. (Hint: put △ABC and △A′B′C′ into standard position.)

11. Let △ABC, △A′B′C′ be hyperbolic triangles such that ∠A = ∠A′, the length
of AB is equal to the length of A′B′, and ∠C = ∠C′. Prove that there exists an
isometry g such that g(A) = A′, g(B) = B′, and g(C) = C′. This is the AAS
theorem. (Hint: the law of sines and the law of cosines might be useful here.)

12. Prove that for any hyperbolic triangle △ABC, there exists a unique circle that is
tangent to AB, BC, and CA. This is known as the incircle of △ABC. (Hint: the
center of this circle is the incenter. The proof is exactly the same as in Euclidean
geometry; you will need to use the AAS theorem.)

13. Let △ABC, △A′B′C′ be hyperbolic triangles, and suppose that ∠A = ∠A′,
∠B = ∠B′, and ∠C = ∠C′. We aim to prove that there exists an isometry g
such that g(A) = A′, g(B) = B′, and g(C) = C′, that is, the AAA theorem.

a) Using an isometry if necessary, we can assume that △ABC, △A′B′C′ are
hyperbolic triangles in 𝔻², their incenters are 0, and AB is orthogonal to ℝ.
That is, they are in the following standard configuration.

Let r be the hyperbolic radius of the incircle. Prove that the inversive
coordinates of AB, CA, and BC are

\[
\begin{aligned}
&(\sinh(r), \sinh(r), \cosh(r)) \\
&(\sinh(r), \sinh(r), \cosh(r)e^{i\theta}) \\
&(\sinh(r), \sinh(r), \cosh(r)e^{i\phi})
\end{aligned}
\]

for some θ, φ.

b) Prove that

\[
\begin{aligned}
\cos(\theta)\cosh(r)^2 - \sinh(r)^2 &= \cos(\angle A) \\
\cos(\phi)\cosh(r)^2 - \sinh(r)^2 &= \cos(\angle B) \\
\cos(\theta - \phi)\cosh(r)^2 - \sinh(r)^2 &= \cos(\angle C).
\end{aligned}
\]

(Hint: use the inversive product.)

c) Define λ_a = 1 − cos(∠A), λ_b = 1 − cos(∠B), λ_c = 1 − cos(∠C). Prove
that

\[
\begin{aligned}
\cos(\theta) &= 1 - \frac{\lambda_a}{\cosh(r)^2} \\
\cos(\phi) &= 1 - \frac{\lambda_b}{\cosh(r)^2} \\
\cos(\theta - \phi) &= 1 - \frac{\lambda_c}{\cosh(r)^2}.
\end{aligned}
\]


d) Prove that

\[ \cosh(r)^2 = \frac{2\lambda_a\lambda_b\lambda_c}{\lambda_a^2 + \lambda_b^2 + \lambda_c^2 - (\lambda_a - \lambda_b)^2 - (\lambda_b - \lambda_c)^2 - (\lambda_c - \lambda_a)^2}. \]

e) Prove that △ABC ≅ △A′B′C′. (Hint: the previous part shows that cosh(r) is
uniquely determined by the angles. Use this to show that the sides are also
uniquely determined.)

14. Prove that a hyperbolic circle is the incircle of some hyperbolic triangle if and
only if its hyperbolic radius is less than ln(3)/2. (Hint: use an isometry to reduce
to the case where the center of the circle is 0 ∈ 𝔻².)

15. Complete the proof of Theorem 5.5.

16. Prove Theorem 5.7.

17. Prove Theorem 5.11.

18. Prove the existence and uniqueness of the Iwasawa decomposition for SL(2, ℝ),
that is, Corollary 5.2. (Hint: look at how Corollary 5.1 was proved from the
corresponding result about PSL(2, ℂ). Mimic this proof strategy.)


5.3 PROOFS (Calculus)


1. Consider the function f(x) = x + 1/x defined on (0, ∞). Prove that f′(x) > 0
if x > 1, f′(x) < 0 if x < 1, and f′(x) = 0 if x = 1. Use this to show that f(x)
has a global minimum at x = 1.


2. We wish to define the hyperbolic area of a region in ℍ². Consider a
Euclidean rectangle with vertices A = (x, y), B = (x + Δx, y), C = (x, y + Δy),
D = (x + Δx, y + Δy). If Δx and Δy are very small, the hyperbolic area of
this region should be approximately the hyperbolic length from A to B times the
hyperbolic length from A to C.

a) Prove that

\[
\begin{aligned}
\lim_{\Delta y \to 0} \left( d_{\mathrm{hyper}}(x + iy, x + i(y + \Delta y)) - \frac{1}{y}\, d_{\mathrm{Euclid}}(x + iy, x + i(y + \Delta y)) \right) &= 0 \\
\lim_{\Delta x \to 0} \left( d_{\mathrm{hyper}}(x + iy, (x + \Delta x) + iy) - \frac{1}{y}\, d_{\mathrm{Euclid}}(x + iy, (x + \Delta x) + iy) \right) &= 0.
\end{aligned}
\]

b) Given the result of the preceding part, why is the definition that the area of
R ⊂ ℍ² is

\[ \iint_R \frac{dx\,dy}{y^2} \]

sensible?

3. We compute the hyperbolic area of idealized triangles.

a) For any R ⊂ ℍ² and any isometry g ∈ Isom(ℍ², d_hyper), why is

\[ \iint_R \frac{dx\,dy}{y^2} = \iint_{g(R)} \frac{dx\,dy}{y^2}? \]

(Hint: there are two ways to go about this. You can argue by thinking about
what happens to small rectangles under hyperbolic isometries. If you are
familiar with the basics of differential forms, it can also be done simply via
the usual change of coordinates formulas. In either case, it might be helpful
to consider basic kinds of elements in Isom(ℍ², d_hyper).)

b) Let R be the subset of ℍ² bounded by the lines x = −1, x = 1 and the circle
x² + y² = 1. Prove that

\[ \iint_R \frac{dx\,dy}{y^2} = \pi. \]

(Hint: the hardest part here is to set up the bounds of the integral. Simplify
your life by dividing R into three pieces by cutting along y = 1.)

c) Prove that any idealized hyperbolic triangle with all of its vertices on the
boundary has area π. (Hint: use the result of Exercise 5.2.2.)


5.4 PROOFS (Group Theory) 


1. Let G be a group acting on a set X. A fixed point of G is a point x ∈ X such that
g.x = x for all g ∈ G.


a) Prove that the action of S¹ on ℂ via matrix multiplication has exactly one
fixed point. What is it?

b) Prove that the action of SL(2, ℝ) on ℝ² via matrix multiplication has exactly
one fixed point. What is it?

c) Prove that the action of SL(2, ℝ) on ℍ² (via hyperbolic isometries) has no
fixed points.


2. Let G be a group acting on a set X. For any point x ∈ X, the stabilizer subgroup
G_x is the set of all g ∈ G such that g.x = x.


a) Prove that G_x is a group.

b) Prove that G_x is the largest subgroup of G for which x is a fixed point.

c) Determine the stabilizer subgroup of (1, 0) ∈ ℝ² under the action of SL(2, ℝ)
via matrix multiplication.

d) Recall that Sym(X) is the group of bijective functions f : X → X with
composition as the operation. Show that for any x ∈ X, Sym(X)_x is isomorphic
as a group to Sym(X \ {x}).

e) Prove that if the action is transitive, then all of the stabilizer subgroups are
isomorphic to one another.


3. Consider PSL(2, ℝ) acting on ℍ² via hyperbolic isometries. Prove that the
stabilizer subgroup of any point in ℍ² is the set of elliptic elements in PSL(2, ℝ)
that are rotations around that point.

4. Consider PSL(2, ℂ) acting on ℍ³ via hyperbolic isometries. What is the stabilizer
subgroup of a point in ℍ³?


Set Theory 


Throughout this book, I make heavy use of set theoretic notation. For our purposes, 
we do not need to dig into the abstract formalism of set theory; instead, we will make 
use of what is commonly known as naive set theory. Those interested in learning 
more might consult Halmos’ Naive Set Theory [5] or Jech’s Set Theory [7]. 


A.1 Basic Constructions


To begin with, what is a set? Roughly speaking, a set is a collection, and it is
completely determined by what its elements are. For instance, one can have sets like


S₁ = {1, 2, 5, 11}
S₂ = {‘b’, ‘e’, ‘f’},

where the first set has four elements 1, 2, 5, 11, and the second has three elements
‘b’, ‘e’, ‘f’. The elements are not ordered, so, for example,

{1, 2, 5, 11} = {5, 2, 11, 1}.


Two sets are the same if and only if they have the same elements. If x is an element
of a set S, then we express this as x ∈ S. If x is not an element of S, then we write
this as x ∉ S. Thus, 1 ∈ S₁, 11 ∈ S₁, but 13 ∉ S₁. These elements can be anything
whatsoever, including other sets. So, for instance,


S₃ = {0, 2, 3, {1, 3, 7}}


is a perfectly kosher set which has four elements in it, namely 0, 2, 3 and {1, 3, 7}.
There are various standard operations on sets. For example, we can take the union
of two sets, which produces a new set that contains precisely those elements that
belong to at least one of the two sets. So, for example,

S₁ ∪ S₂ = {1, 2, 5, 11, ‘b’, ‘e’, ‘f’}
S₁ ∪ S₃ = {0, 1, 2, 3, 5, 11, {1, 3, 7}}.


Another common operation is taking intersections: the intersection of two sets is a
set containing precisely all of the elements that are in both sets, such as


S₁ ∩ S₃ = {2}
S₂ ∩ S₃ = {},


where {} is the empty set, also denoted by ∅, which is the unique set that has no
elements. Given a set S, we can always form a set {S} that just has S as an element.
For instance,


{∅} = {{}}


is an entirely acceptable set which, it must be stressed, is not the same as ∅: the
set ∅ contains no elements, but {∅} has exactly one. Intuitively, if you think about a set
like a basket, then ∅ is an empty basket, but {∅} is a basket that has a basket inside
of it, which is quite different.
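
The operations above can be tried out directly in Python, whose built-in `set` type behaves much like a finite naive set. The following is only a sketch: the variable names are my own, and the nested set {1, 3, 7} has to be written as a `frozenset`, because Python does not allow an ordinary (mutable) set to be an element of another set.

```python
# The example sets from the text. Order of elements does not matter.
S1 = {1, 2, 5, 11}
S2 = {"b", "e", "f"}
# A set inside a set must be a frozenset (immutable, hence hashable).
S3 = {0, 2, 3, frozenset({1, 3, 7})}

union_12 = S1 | S2   # union: elements in at least one of the sets
union_13 = S1 | S3
inter_13 = S1 & S3   # intersection: elements in both sets
inter_23 = S2 & S3

# The empty basket versus the basket that contains an empty basket.
empty = frozenset()
basket_in_basket = frozenset({empty})

print(len(empty), len(basket_in_basket))
```

The two lengths printed at the end are the point of the basket analogy: the empty set has no elements, while the set containing it has exactly one.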

Given any set S, we say that another set T is a subset (and we write T ⊂ S) if
every element in T is an element of S. We say that T is a proper subset of S if T ⊂ S
and T ≠ S. (Note that if S = T then S ⊂ T is trivially true.) Thus, if


X₁ = {△, □, ○}
X₂ = {△, ○}
X₃ = {△},


then X₃ ⊂ X₂ ⊂ X₁ (in fact, they are all proper subsets), but X₁ is not a subset of
X₂ and X₂ is not a subset of X₃. It is easy to see that S = T if and only if S ⊂ T and
T ⊂ S; this is also one of the most common ways to prove that two sets are in fact
the same. There are two very important operations involving subsets. First, given any
set S, we can produce a new subset T ⊂ S that consists precisely of all elements
in S satisfying some kind of condition. This is most commonly written using what
is known as “set-builder” notation, where the set that we are taking a subset of is on
the left, and the condition we wish to impose is on the right. For example,


{x ∈ S₁ | x < 6} = {1, 2, 5}.


That is, we started with the set S₁ = {1, 2, 5, 11} and imposed the condition that we
only keep those elements x ∈ S₁ such that x < 6. Here is another example:


E = {{0, 1}, {1, 3}, {2}, {4, 5, 6}, {1}}
F = {S ∈ E | 1 ∈ S}
  = {{0, 1}, {1, 3}, {1}}.




To start, E is a set of some sets of integers; F is just the subset of all sets that contain
1. This process of producing subsets by imposing some condition on an existing
set is known as restricted comprehension. One special instance of this is when we
remove elements in a set from a larger set that contains it. That is, if S₁ ⊂ S₂ we
define


S₂ \ S₁ = {x ∈ S₂ | x ∉ S₁}.
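
Set-builder notation has a close analogue in Python's set comprehensions, which makes the two examples above easy to experiment with. This is a sketch under the same convention as before (inner sets are `frozenset` objects); the names `small` and `difference` are mine, not notation from the text.

```python
S1 = {1, 2, 5, 11}

# {x in S1 | x < 6}: keep the elements of S1 satisfying the condition.
small = {x for x in S1 if x < 6}

# E is a set of sets of integers; F keeps those members that contain 1.
E = {
    frozenset({0, 1}),
    frozenset({1, 3}),
    frozenset({2}),
    frozenset({4, 5, 6}),
    frozenset({1}),
}
F = {S for S in E if 1 in S}

# Set difference S1 \ small, written both ways.
difference = {x for x in S1 if x not in small}
assert difference == S1 - small

print(small, difference)
```

A comprehension always starts from an existing set, just as restricted comprehension does, so the result is automatically a subset of the set one began with.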


Another important construction involving subsets is the power set of a set S,
typically denoted by 𝒫(S) or 2^S, which is the set of all subsets of S. The rationale
for the notation 2^S is simple: if S is a finite set with n elements, then 2^S will have 2ⁿ
elements. For example,


2^{X₁} = {{△, □, ○}, {△, □}, {△, ○}, {□, ○}, {△}, {□}, {○}, ∅}


has 2³ = 8 elements, whereas X₁ = {△, □, ○} has 3. It is a good exercise for the
reader to show that this is true in general.
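
Before proving the count in general, it can be reassuring to check it by brute force. The sketch below builds the power set of a finite set with the standard library's `itertools.combinations`; the function name `power_set` and the string labels standing in for the three shapes are my own choices.

```python
from itertools import chain, combinations


def power_set(s):
    """Return the set of all subsets of the finite set s, each as a frozenset."""
    items = list(s)
    # Subsets of size 0, 1, ..., len(items), chained into one stream.
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}


X1 = {"triangle", "square", "circle"}
P = power_set(X1)

# Compare the number of subsets with 2**n for a few small n.
for n in range(6):
    print(n, len(power_set(set(range(n)))), 2 ** n)
```

Of course, a finite check of this kind is evidence rather than proof; the exercise still asks for an argument that works for every n.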


A.2 Some Common Sets


Thus far, I have only listed examples of finite sets. However, most of the sets that 
we will be interested in are not finite. Here are some examples, together with the 
symbols typically used to denote those sets. 


ℕ      the set of natural numbers
ℤ      the set of integers
ℚ      the set of rational numbers
ℚ⁺     the set of positive rational numbers
ℚ^×    the set of non-zero rational numbers
ℝ      the set of real numbers
ℝ⁺     the set of positive real numbers
ℝ^×    the set of non-zero real numbers
ℂ      the set of complex numbers
ℂ^×    the set of non-zero complex numbers


Here, by the natural numbers, I mean the set {0, 1, 2, 3, ...}. I wish I could honestly
say that this was completely standard notation, but sadly some authors exclude 0 as
a natural number. Consequently, to avoid confusion, I will largely try not to mention
natural numbers. Each of the above-mentioned sets has a standard construction in
set theory. For instance, using von Neumann ordinals, one could define the natural
numbers by nesting sets inside of one another, starting with the empty set. This is
done as follows:

0 = ∅
1 = 0 ∪ {0} = {∅}
2 = 1 ∪ {1} = {∅, {∅}}
3 = 2 ∪ {2} = {∅, {∅}, {∅, {∅}}}
⋮
n = (n − 1) ∪ {n − 1}.


One can then use this to define addition and multiplication entirely in terms of sets. 
Constructions such as these are useful in that they allow us to reduce vast swathes 
of mathematics to just set theory. On the other hand, they tend to be both somewhat
unwieldy and just weird: with the von Neumann definition, it is true that n − 1 ∈ n,
which has never sat right with me. Other approaches are available: one could, for
example, treat everything axiomatically. For the natural numbers, the standard choice 
of axioms is something like Peano’s axioms. One could also apply other branches of 
mathematics such as category theory to give descriptions of these objects. However, 
I will not worry about these foundational issues. For our purposes, it shall suffice to 
know that these basic objects exist so that we can build on top of them. I think that 
the exact details of how to define them are best left to a different text. 
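
For readers who would like to see the von Neumann construction in action, here is a small sketch that builds the first few natural numbers as nested `frozenset` objects. The function name `von_neumann` is hypothetical, and the code is meant only as an illustration of the recursion n = (n − 1) ∪ {n − 1}, not as a practical way to do arithmetic.

```python
def von_neumann(n):
    """Return the von Neumann ordinal for the natural number n."""
    current = frozenset()  # 0 is the empty set
    for _ in range(n):
        # The successor of k is k together with the one-element set {k}.
        current = current | frozenset({current})
    return current


two = von_neumann(2)
three = von_neumann(3)

# The slightly odd features mentioned in the text:
print(two in three)     # n - 1 is an element of n
print(two < three)      # n - 1 is also a proper subset of n
print(len(three))       # the set n has exactly n elements
```

Both membership and the proper subset relation hold between consecutive numbers, which is exactly the sort of behavior that makes this encoding feel strange.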


A.3 Ordered Pairs and Relations 


By definition, sets are unordered. However, we will often need to consider objects 
like sets but where the order of the elements matters, and where an element can 
appear multiple times. There is a very clever definition due to Kuratowski that just 
makes use of set theoretic language—specifically, in 1921 he defined an ordered pair 
as 


(a, b) = {{a}, {a, b}}. 


Kuratowski’s was not the first definition; Wiener and Hausdorff had made their own
definitions in 1914. It just happens that Kuratowski’s definition is the easiest to work
with using standard formulations of set theory for the purposes of proving things.
Regardless, all definitions of ordered pairs share the feature that they add some
kind of asymmetry to the set, which allows you to differentiate a “first” element and a
“second” element, allowing you to prove the characteristic property of ordered pairs,
namely that (a, b) = (c, d) if and only if a = c and b = d. One can extend this to
ordered triples, ordered quadruples and the like by just nesting ordered pairs, e.g.,


(a,b,c, d) := (((a,b),c),d). 
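
The Kuratowski definition is concrete enough to be tested by machine on small examples. In the sketch below, `kpair` and `ktuple` are names I have made up; `kpair` encodes (a, b) as {{a}, {a, b}} using `frozenset`, and `ktuple` nests pairs to the left exactly as in the display above.

```python
from itertools import product


def kpair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def ktuple(*xs):
    """Ordered tuple of length at least 2, built as (((x1, x2), x3), ...)."""
    result = kpair(xs[0], xs[1])
    for x in xs[2:]:
        result = kpair(result, x)
    return result


# Brute-force check of the characteristic property on a small range:
# (a, b) = (c, d) exactly when a = c and b = d.
property_holds = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(range(4), repeat=4)
)
print(property_holds)
```

Note that when a = b the pair collapses to {{a}}, a set with a single element; the characteristic property still holds in that case, which is part of what makes the definition clever.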


Again, the exact mechanics of how we define things doesn’t really matter for our
purposes. The important thing is the characteristic property that (x₁, x₂, ..., x_n) =
(y₁, y₂, ..., y_n) if and only if x_i = y_i for all 1 ≤ i ≤ n. With this, given a collection
of sets S₁, S₂, ..., S_n, we can define their Cartesian product to be the set

⨉_{i=1}^{n} S_i = S₁ × S₂ × ⋯ × S_n
               = {(x₁, x₂, ..., x_n) | x₁ ∈ S₁, x₂ ∈ S₂, ..., x_n ∈ S_n}.


It is customary to write Sⁿ to denote the Cartesian product of a set S with itself
n times. As an example, the set of Cartesian pairs of real numbers is ℝ²; this is
essentially the set of Cartesian coordinates of the plane, whence the name.
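
For finite sets, the Cartesian product can be listed explicitly. The following sketch uses `itertools.product` from the Python standard library, with Python tuples playing the role of ordered tuples; the two example sets are my own and are not taken from the text.

```python
from itertools import product

S1 = {1, 2}
S2 = {"a", "b", "c"}

# S1 x S2: all ordered pairs with first entry in S1 and second entry in S2.
cartesian = set(product(S1, S2))

# S1^2: the product of S1 with itself two times.
square = set(product(S1, repeat=2))

print(len(cartesian), len(square))
```

The order of the factors matters: (1, "a") belongs to S1 × S2, but ("a", 1) does not.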

A binary relation R between two sets S₁, S₂ is formally defined as a subset of
S₁ × S₂. However, in practice, we usually think about relations in a slightly different
way: given two elements x ∈ S₁, y ∈ S₂, we say that the relation holds for x, y
if (x, y) ∈ R. We also usually use a slightly different notation: instead of writing
(x, y) ∈ R, we instead write xRy. Thus, you should essentially think of the subset R of
S₁ × S₂ as the collection of pairs for which the relation R is true. Here is an example:
take any set S and define a subset


R = {(x, y) ∈ S² | x = y}.

The relation R is secretly just =. Here is another example:

R = {(x, y) ∈ ℤ² | x = y + z for some z ∈ ℕ}.


Given two integers x, y, xRy if and only if x = y + z for some natural number
z, that is, if x ≥ y. Here is one last example:


R = {(p₁, p₂) ∈ Humans² | p₁ is a child of p₂}.


For any binary relation on two sets S₁, S₂, we refer to S₁ as the domain and S₂ as
the codomain.
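
Since a relation is nothing more than a set of ordered pairs, the first two examples can be written down literally, provided we restrict to a finite window of integers. In this sketch the window `S`, the finite stand-in `naturals` for ℕ, and the helper `holds` are all assumptions made for illustration.

```python
from itertools import product

S = set(range(-3, 4))      # a finite window of integers, -3 through 3
naturals = range(0, 7)     # finite stand-in for N; 6 is the largest gap in S

# The relation "x = y" as a subset of S x S.
equality = {(x, y) for x, y in product(S, repeat=2) if x == y}

# The relation "x = y + z for some natural number z" as a subset of S x S.
geq = {
    (x, y)
    for x, y in product(S, repeat=2)
    if any(x == y + z for z in naturals)
}


def holds(R, x, y):
    """xRy means precisely that the pair (x, y) is an element of R."""
    return (x, y) in R


print(holds(geq, 2, -1), holds(geq, -1, 2), holds(geq, 0, 0))
```

On this window, the second relation coincides with the usual ordering x ≥ y, and the equality relation is a subset of it.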


Functions 


A function f on two sets S₁, S₂ is a special type of relation with the following
additional requirement: for every x ∈ S₁, there exists a unique y ∈ S₂ such that
(x, y) ∈ f. Based on the fact that it is unique, we usually write this in the more
familiar form f(x) = y. To signify that a relation is a function, we typically employ
a slightly different notation and write f : S₁ → S₂. Here f is the name of the
function, S₁ is its domain, and S₂ is its codomain. We often want to specify a function
by some type of rule. The usual notation for this is

\[
\begin{aligned}
f : S_1 &\to S_2 \\
x &\mapsto g(x).
\end{aligned}
\]

Here, the first line is telling us the name of the function, its domain and codomain;
the second line is telling us that for any element x ∈ S₁, f(x) returns whatever
the expression g(x) evaluates to. For example, given any set S, we might want to define the identity
function

\[
\begin{aligned}
\mathrm{id}_S : S &\to S \\
x &\mapsto x
\end{aligned}
\]


which has the following behavior: it accepts as inputs elements x ∈ S, and simply
returns them back, unchanged. We might also have less trivial functions such as

\[
\begin{aligned}
f : \mathbb{Z} &\to \mathbb{R} \\
n &\mapsto \sqrt{1 + n^2}.
\end{aligned}
\]

This should be understood as follows: as a function, f accepts integers and returns
real numbers. For any integer n, f(n) = √(1 + n²). Here is another example:

\[
\begin{aligned}
E : \mathbb{R}^+ \times \mathbb{R} &\to \mathbb{R} \\
(x, y) &\mapsto x^y.
\end{aligned}
\]

Here, the function E accepts ordered pairs (x, y) of real numbers where the first
element has to be positive and returns x^y. Sometimes, we will have functions that
are easiest to describe piecewise: we might say that they do some simple operation
g₁ if some condition C₁ is satisfied, or they might do some other simple operation
g₂ if some other condition C₂ is satisfied. Formally, given a function f : S₁ → S₂
such that S₁ = C₁ ∪ C₂ ∪ ... ∪ C_n, and the sets C_i do not intersect one another, we will
write

\[
\begin{aligned}
f : S_1 &\to S_2 \\
x &\mapsto \begin{cases} g_1(x) & \text{if } x \in C_1 \\ g_2(x) & \text{if } x \in C_2 \\ \quad\vdots \\ g_n(x) & \text{if } x \in C_n \end{cases}
\end{aligned}
\]

to mean that f(x) = g_i(x) if x ∈ C_i. There are many common examples of this,
such as

\[
x \mapsto \begin{cases} x & \text{if } x \geq 0 \\ -x & \text{if } x < 0 \end{cases}
\]

which we all know as the absolute value function. We will have our own examples
of such functions, such as

\[
\begin{aligned}
|\kappa| : \{\text{circles, lines in } \mathbb{R}^2\} &\to \mathbb{R} \\
C &\mapsto \begin{cases} 1/R & \text{if } C \text{ is a circle with radius } R \\ 0 & \text{if } C \text{ is a line.} \end{cases}
\end{aligned}
\]


Another common way of building new functions from old ones is to compose them
together. Specifically, suppose that I have a function f : X → Y and a function
g : Y → Z. Then I can define a new function

\[
\begin{aligned}
h : X &\to Z \\
x &\mapsto g(f(x)).
\end{aligned}
\]

We call h the composition of f and g; we usually denote this relationship by writing
h = g ∘ f.

Finally, it is common to produce new functions via restriction. Suppose that we
have a function Q : X → Y and Z ⊂ X. Then we can define a new function

\[
\begin{aligned}
Q|_Z : Z &\to Y \\
x &\mapsto Q(x).
\end{aligned}
\]

Intuitively, this is just the same function as before; we have simply shrunk its domain
a little.
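
On finite sets, all of the constructions in this section can be made completely explicit. The sketch below first tests whether a set of pairs satisfies the defining requirement of a function, and then implements composition and restriction for functions stored as Python dictionaries. The helper names `is_function`, `compose`, and `restrict`, as well as the sample data, are hypothetical.

```python
def is_function(R, domain, codomain):
    """True if R, a set of pairs, gives each x in domain exactly one y in codomain."""
    if any(x not in domain or y not in codomain for x, y in R):
        return False
    return all(sum(1 for (a, _) in R if a == x) == 1 for x in domain)


def compose(g, f):
    """Return g o f, that is, the function sending x to g(f(x))."""
    return {x: g[f[x]] for x in f}


def restrict(Q, Z):
    """Return Q restricted to the subset Z of its domain."""
    return {x: Q[x] for x in Z}


X = {0, 1, 2}
f = {0: "a", 1: "b", 2: "a"}   # f : X -> {"a", "b"}
g = {"a": 10, "b": 20}         # g : {"a", "b"} -> {10, 20}

h = compose(g, f)
f_on_Z = restrict(f, {0, 1})
print(h, f_on_Z)
```

A dictionary is a natural finite model of a function because each key can appear only once, which is precisely the uniqueness requirement in the definition.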


A.4 Injections, Surjections, and Bijections 


A function f : X → Y is called injective (or one-to-one, or sometimes monic) if for
every x₁, x₂ ∈ X, f(x₁) = f(x₂) implies that x₁ = x₂. A function f : X → Y is
called surjective (or onto, or sometimes epic) if for every y ∈ Y, there exists x ∈ X
such that f(x) = y. Intuitively, injections are functions that never map to the same
element in Y twice; surjections are functions that map to every element in Y. Let’s
look at some examples. The function

\[
\begin{aligned}
f : \mathbb{Z} &\to \mathbb{Z} \\
n &\mapsto 2n
\end{aligned}
\]

is injective, since 2n = 2m implies that n = m. However, it is not surjective, since
there is no integer n such that 2n = 1, for instance.

Fig. A.1 Visual representations of three functions; the first one is injective, the second one is
surjective, and the third is bijective.

The function

\[
\begin{aligned}
g : \mathbb{R} \times \mathbb{R} &\to \mathbb{R} \\
(x, y) &\mapsto x + y
\end{aligned}
\]

is surjective, since for any z ∈ ℝ, g((z, 0)) = z. However, it is not injective, since it
is also true that g((z − 1, 1)) = z, among infinitely many other options. A good exercise is
to go through the functions listed in the previous section and to decide whether they
are injective, surjective, both, or neither.
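
The two definitions translate directly into checks on finite functions. In the sketch below, functions are again stored as dictionaries, and the doubling and addition maps are cut down to small finite domains and codomains; these truncations, and the names used, are assumptions made so that the checks can actually run.

```python
from itertools import product


def is_injective(f):
    """No two inputs share an output."""
    values = list(f.values())
    return len(values) == len(set(values))


def is_surjective(f, codomain):
    """Every element of the codomain is hit by some input."""
    return set(f.values()) == set(codomain)


# n -> 2n on the integers -3..3, with codomain the integers -6..6.
doubling = {n: 2 * n for n in range(-3, 4)}
doubling_codomain = set(range(-6, 7))

# (x, y) -> x + y on pairs from {0, 1, 2}, with codomain {0, ..., 4}.
addition = {(x, y): x + y for x, y in product(range(3), repeat=2)}
addition_codomain = set(range(5))

print(is_injective(doubling), is_surjective(doubling, doubling_codomain))
print(is_injective(addition), is_surjective(addition, addition_codomain))
```

The finite versions behave like their infinite counterparts: doubling is injective but misses the odd numbers, while addition reaches every value in its codomain but does so in more than one way.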

A function that is both injective and surjective is called bijective. Such a function
is one such that for every y ∈ Y in the codomain, there exists a unique x ∈ X in the
domain such that f(x) = y. This is to say that any bijective function f : X → Y
is invertible: there exists another function f⁻¹ : Y → X such that f⁻¹ ∘ f = id_X
and f ∘ f⁻¹ = id_Y. In fact, this goes the other way as well.


Theorem A.1 Let f : X → Y be a function; it is invertible if and only if it is
bijective.


Proof We have already demonstrated that if f is bijective, then it is invertible. It
remains to show that if there exists a function f⁻¹ : Y → X such that f⁻¹ ∘ f = id_X
and f ∘ f⁻¹ = id_Y then f is bijective. Well, choose any two x₁, x₂ ∈ X and suppose
that f(x₁) = f(x₂). It follows that


f⁻¹(f(x₁)) = f⁻¹(f(x₂))
id_X(x₁) = id_X(x₂)
x₁ = x₂.


Therefore, f is injective. Next, choose any y ∈ Y. Note that if we choose x =
f⁻¹(y) ∈ X, then f(x) = f(f⁻¹(y)) = id_Y(y) = y. Therefore, f is surjective.
We conclude that it is bijective.
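
For a finite function, the theorem can be watched at work: swapping the roles of inputs and outputs produces an inverse exactly when the function is a bijection. The helper `inverse` below is a hypothetical illustration that returns `None` when no inverse exists.

```python
def inverse(f, codomain):
    """Return the inverse of f as a dict, or None if f is not a bijection onto codomain."""
    inv = {}
    for x, y in f.items():
        if y in inv:
            # Two inputs share the output y, so f is not injective.
            return None
        inv[y] = x
    if set(inv) != set(codomain):
        # Some element of the codomain is never reached, so f is not surjective.
        return None
    return inv


f = {1: "a", 2: "b", 3: "c"}
f_inv = inverse(f, {"a", "b", "c"})

# The two identities from the definition of invertibility.
left_identity = all(f_inv[f[x]] == x for x in f)        # f_inv o f = id_X
right_identity = all(f[f_inv[y]] == y for y in f_inv)   # f o f_inv = id_Y
print(left_identity, right_identity)
```

The two early returns correspond to the two halves of the proof: a failure of injectivity and a failure of surjectivity each rule out an inverse.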


Intuitively, a bijection matches up all of the elements in the domain and all of the 
elements in the codomain in a unique fashion; if there is a bijection between two 
sets, then you can essentially think of one set as just being the other one, but with 
all of its elements relabeled, where the bijection is precisely what keeps track of the 
labeling. Some examples of injective, surjective, and bijective functions are drawn 
in Figure A.1.